1 Introduction

Normal pressure hydrocephalus (NPH) is a neurodegenerative disease characterized by ventriculomegaly with normal cerebrospinal fluid (CSF) opening pressure and the clinical triad of cognitive impairment, symmetric gait disturbance and urinary incontinence, also known as Hakim’s triad [1]. NPH is one of the leading causes of treatable dementia in the elderly, especially if there is no other neurologic comorbidity [1]. Cognitive impairment has variable severity and manifests mainly through attention deficit, decrease of psychomotor speed and disruption of executive functions [2]. Disturbances in memory may be secondary to decreased frontal lobe functional integration [3]. Other common symptoms are difficulty in word retrieval and speech production, and impairment of visuospatial skills [4]. Mood and behavioral disturbances are also relatively common and include apathy, bradypsychia and bradyphrenia [4, 5].

Cognitive symptoms are caused by dysfunction of frontal-subcortical pathways [6]. In an initial stage, deficits are observed in functions supported by frontostriatal circuits, namely in the executive domain and working memory [4, 11, 12]. Early shunting may delay disease progression and improve quality of life [3]. Reported success rates range between 60 and 80%; hence the importance of adequately assessing and selecting patients [3, 13].

Since a specific cognitive profile in NPH patients has not been identified [14], using standardized protocols to assess cognitive functions is of great importance to ensure proper diagnosis, especially as NPH is thought to be underdiagnosed and increased awareness of its clinical manifestations is necessary [15]. In this study, we assessed changes in neuropsychological evaluations before and after a lumbar tap test in patients with NPH. This study also compares the cognitive profiles of patients with NPH meeting criteria for a dementia diagnosis (D +) and those without criteria for dementia (D−) at both baseline and after the tap-test intervention, providing insights into the cognitive differences between these two patient cohorts.

2 Methods

2.1 Design and participants

This is a quasi-experimental, before-and-after study based on a specialized clinical registry of consecutive patients with an NPH diagnosis who presented to the NPH Clinical Care Program of a tertiary care university hospital in Bogotá, Colombia, between May 2016 and October 2020. The study site is a clinical center certified by Joint Commission International since 2017. Patients with symptoms of the triad, primarily gait disturbance, presented to neurosurgery consultation, where each patient was examined to determine eligibility for entry into the center. Patients entered a diagnostic process at the clinical center if NPH was suspected based on the presence of at least one of the triad symptoms (gait disturbance, urinary incontinence and cognitive impairment) and suggestive findings on neuroimaging studies. Patients were evaluated by a multidisciplinary group of specialists (neurosurgery, neurology, geriatrics, neuropsychology, and occupational, speech and physical therapy) to determine NPH diagnosis. Cognitive assessment was made by a neuropsychology group trained to apply the evaluation tests. This study included adult patients who presented to the NPH Clinical Care Program and were diagnosed with NPH by consensus of the multidisciplinary group of specialists that assessed the triad of symptoms. Patients with a visual or hearing impairment, or active delirium without a caregiver or informant that could affect cognitive evaluation, were excluded.

Based on the interdisciplinary evaluation involving specialists in neurology, geriatrics, neuropsychology, and neurosurgery, the presence of dementia was established. Because multiple diseases involving cognitive decline and functional loss, like NPH, do not show compromised memory in initial phases, participants included in the dementia (D +) group met the main clinical criteria for the diagnosis of dementia (of any etiology) based on the recommendations of the National Institute on Aging and Alzheimer's Association criteria [16]. These criteria suggest a progressive course, and functional impairments and compromise in two or more cognitive domains independent of memory.

The cognitive profiles of patients who met the criteria for a dementia diagnosis (D +) and those without criteria for dementia (D-) at both baseline and after the tap-test intervention were compared. The D + group was not further subdivided based on the degree of cognitive deterioration (mild, moderate, and severe) to avoid further subdividing the sample and to ensure a substantial group for comparison with the non-dementia group (D−).

We used clinical records of the neuropsychological evaluations of attention, memory, praxis and executive function. Analogous tests were administered for each cognitive domain to mitigate the impact of learning effects on the assessment of changes before and after a tap test. Cognitive assessment was made by neuropsychologists who were trained to apply evaluation tests in this specific clinical population. All consultations before the tap test were conducted in the morning. The neuropsychology evaluation lasted 60 min; the other consultations lasted 30 min each. Subsequently, a lumbar puncture was performed. The lumbar tap test involved removing CSF until a closing pressure of 0 cmH2O was achieved; at our NPH Clinical Care Program, we guide CSF extraction not by a standard quantity but by the quantity needed to decrease CSF pressure to a closing pressure of 0 cmH2O, as symptom improvement after the tap test is believed to occur due to the decrease in pressure, not directly as a result of removing a certain quantity of fluid [17]. The evaluations after the tap test were conducted the following afternoon. The time interval between neuropsychological evaluations was 24 hours. This study was approved by the institutional ethics review board prior to the start of the study.

2.2 Measurements

Demographic and clinical characteristics including age, sex, level of education, triad symptoms, history of vascular disease, hypertension, diabetes mellitus, and dementia were collected.

2.2.1 Neuropsychological tests

Validated neuropsychological tests of memory, praxis, verbal fluency and executive function were used to assess cognitive performance before and after the tap-test.

Memory: Explicit verbal memory was evaluated through both the Alzheimer’s Disease Assessment Scale–Cognitive Subscale (ADAS-COG) and the Consortium to Establish a Registry for Alzheimer's Disease Word List Memory Task (CERAD) [11, 18, 19, 20]. Immediate recall of three learning trials gives information about short term memory (STM), and a delayed recall trial gives information about long term memory (LTM). These scales were validated for the Colombian population by Romero-Vanegas (2014) and Aguirre-Acevedo et al. [21, 22]. Two memory tests were selected so that one could be used before the tap test and the other after it, to avoid learning effects in the post-tap test evaluation. Moreover, the order of the tests alternated randomly.

Visuoconstructional praxis: We used the Rey-Osterrieth Complex Figure Copy (ROCF) [14]. This test has been validated for the Colombian population by Hernández et al. [15].

Verbal fluency: We used categories such as “animals or fruits” to assess semantic verbal fluency (SVF) [23]. Phonological verbal fluency (PVF) was assessed with words starting with letter “P or M” [23]. Both have been validated for the Colombian population [15].

Executive function: We used the INECO Frontal Screening (IFS) [24].

2.3 Statistical analysis

A sample size of at least 50 patients was determined to detect small changes in continuous variables (scores) between dependent/paired samples, but it was planned to include all patients from the described period that fulfilled selection criteria to further increase power and precision. Extreme and odd values were verified with source documents. Statistical analyses were conducted using alpha = 0.05 and beta = 0.2 in RStudio software v. 1.3.10.

Descriptive analysis was used to summarize clinical features, level of education and results of neuropsychological assessments. The Shapiro–Wilk test was used to assess the distribution of continuous variables, which determined the use of mean and standard deviation (SD) or median and interquartile range (IQR).

Because the results from different neuropsychological tests have variable interpretations depending on the level of education, population mean values were used to calculate z-scores for all tests and each patient according to the level of education. Z-score units describe standard deviations away from the mean, and thus served to determine “altered” or “not altered” cognitive performance based on z-scores < -1 or > 1, depending on the test (1 SD away from the population mean, considering age and education) [15, 21, 22, 24]. An inversion of the z-scores was used for memory tests that score performance based on insufficiency instead of proficiency (ADAS-COG in short and long term memory) to allow for comparisons before and after the tap test. Differences in z-scores before and after the tap test were assessed with the Wilcoxon signed-rank test (paired, non-parametric test) or the paired t-test (parametric) according to data distribution.
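The standardization described above can be illustrated with a minimal sketch. It assumes a single normative mean and SD per test and education stratum; the `Norm` class, the function names and the normative values are hypothetical and are not taken from the study or from the cited validation papers. After the sign inversion for insufficiency-based scales, a single cutoff of z < -1 marks altered performance on every test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Norm:
    """Population mean and SD for one test and one education stratum."""
    mean: float
    sd: float


def z_score(raw: float, norm: Norm, higher_is_worse: bool = False) -> float:
    """Standardize a raw score; flip the sign for error-based scales."""
    z = (raw - norm.mean) / norm.sd
    return -z if higher_is_worse else z


def is_altered(z: float, cutoff: float = -1.0) -> bool:
    """Flag performance more than 1 SD below the population mean."""
    return z < cutoff


# Hypothetical normative values, for illustration only.
fluency_norm = Norm(mean=20.0, sd=4.0)  # proficiency-based test
errors_norm = Norm(mean=5.0, sd=2.0)    # insufficiency-based test

z_fluency = z_score(14.0, fluency_norm)
z_errors = z_score(8.0, errors_norm, higher_is_worse=True)
print(z_fluency, is_altered(z_fluency))
print(z_errors, is_altered(z_errors))
```

Inverting the sign once, at standardization time, keeps the "altered" rule and the before/after comparison identical across proficiency-based and insufficiency-based tests.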

Differences in proportion of patients with deficient performance according to z-score before and after the tap test were assessed for each neuropsychological test with the McNemar’s test (paired “change” test for dichotomous data). Direct comparison of test scores was only adequate for the ROCF and IFS tests, and differences in these scores were also assessed with the Wilcoxon signed-rank test or the paired t-test according to data distribution. Boxplots were used to display distribution and change in individual z-scores for verbal fluency, praxis, executive function and memory tests.
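As an illustration of the paired "change" test for dichotomous data, the following sketch computes an exact two-sided McNemar p-value from the two discordant counts. The text does not state whether an exact or a chi-square variant was used, so the exact binomial form is an assumption here, and the function name and the example counts are hypothetical.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Exact two-sided McNemar p-value from the discordant pairs.

    b: patients altered before and not altered after the tap test
    c: patients not altered before and altered after the tap test
    Concordant pairs do not enter the test.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of change
    k = min(b, c)
    # Under H0 each discordant pair falls in either cell with probability 0.5.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts, for illustration only.
p_value = mcnemar_exact(b=9, c=1)
print(round(p_value, 4))
```

Only patients whose classification changed between the two evaluations contribute to the statistic, which is why the test suits a before-and-after design.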

3 Results

Overall, 76 patients fulfilled selection criteria and were included for analysis. Median age was 81 years (IQR: 77–83) and most patients were men (69.7%). A high level of education (> 12 years of education) was reported by 61.8% of patients. Additionally, data on hypertension, type II diabetes and dementia are reported in Table 1; no other data on comorbidities or medications were collected. Baseline characteristics and classic (triad) symptom improvement after tap test are displayed in Table 1.

Table 1 Demographic and clinical data

3.1 Changes in cognition before and after tap test

A total of 65 patients (87.8%) were considered to have improved cognitive performance after the tap test, based on cognitive tests applied: ADAS-COG/CERAD, SVF, PVF, ROCF, IFS. Table 2 shows the mean scores and standard deviation in patients before and after tap test.

Table 2 Mean scores and standard deviation before and after tap test

Absolute scores and corresponding z-scores of neuropsychological tests are summarized in Table 3. Significant differences in cognitive performance pre- and post-tap test were found regarding the ROCF and the IFS. The distribution of the direct scores and their change after tap test are displayed in Fig. 1.

Table 3 Direct scores and z-scores before and after tap test
Fig. 1 Z-score boxplots. Before (a) and after (b) tap test z-scores for tests of phonological verbal fluency (PVF), semantic verbal fluency (SVF), ROCF (praxis), INECO Frontal Screening (IFS, executive function), short-term memory (STM) and long-term memory (LTM). Differences between the two moments were assessed with the Wilcoxon signed-rank test (paired) or the paired t-test, as suited according to value distribution. A p-value < 0.05 indicates a statistically significant difference

For some patients, not all tests were performed. Thus, the proportion of altered performance was calculated based on available data. Missing data occurred for 12 patients in PVF, 20 patients in SVF, 5 patients in ROCF, 6 patients in IFS, 8 patients in LTM and 23 patients in STM. (LTM and STM are variables determined in the ADAS-COG/CERAD tests).

At baseline, altered performance in verbal fluency occurred in 41/64 patients (69.5%) for PVF and in 47/56 patients (83.9%) for SVF. 42/66 patients (63.6%) and 28/68 patients (41.2%) had altered performance in STM and LTM, respectively; 37/71 patients (52.1%) had altered performance in ROCF assessment, as well as 66/70 patients (94.3%) in the executive functions screening test (IFS). Table 4 displays results of hypothesis testing for change of altered performance.
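The available-data calculation can be sketched for four of the tests, using the cohort size, missing counts and altered counts quoted in the text. The variable and function names are illustrative only.

```python
TOTAL_PATIENTS = 76

# Counts quoted in the text for four of the tests.
missing = {"SVF": 20, "ROCF": 5, "IFS": 6, "LTM": 8}
altered = {"SVF": 47, "ROCF": 37, "IFS": 66, "LTM": 28}


def altered_proportion(test: str) -> tuple[int, int, float]:
    """Return (altered, evaluated, percentage) using only available data."""
    evaluated = TOTAL_PATIENTS - missing[test]
    percentage = round(100 * altered[test] / evaluated, 1)
    return altered[test], evaluated, percentage


for name in altered:
    count, evaluated, pct = altered_proportion(name)
    print(f"{name}: {count}/{evaluated} ({pct}%)")
```

Each denominator is the number of patients who completed that test, so the percentages are not directly comparable to proportions of the full cohort.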

Table 4 Number and proportion of patients with altered performance before and after tap test

3.2 Comparisons in clinical and cognitive improvement after spinal tap discriminating by diagnosis of dementia

Table 5 shows the mean scores and standard deviation in patients with (D +) and without (D-) a previous history of dementia. Interestingly, the mean for D + patients was slightly higher than that of D- patients in all tests except for the ROCF after tap test. The standard deviation (SD) for the scores of D + patients was also higher in all cases, except for STM, where the SD is the same for D + and D− patients. When comparing mean scores before and after tap test, the mean after tap test is higher for all variables, except for SVF, in which the mean for D− patients was lower after tap test and the mean for D + patients did not vary, and STM in D + patients, which also did not vary after the tap test. The SD after tap test was higher than that previous to tap test in almost every variable, the exception being SVF and ROCF, in which the SD decreased after tap test for both D + and D− patients.

Table 5 Mean scores and standard deviation before and after tap test in patients with and without a previous history of dementia

Table 6 displays the comparisons in test performance between patients with a positive history of dementia and a negative history of dementia. No significant differences were found in the proportion of improvement observed in the symptoms of the clinical triad after the spinal tap between D + patients and D− patients. Gait improvement was documented in 94.2% of D + and 87.5% of D− (p value = 0.37, from Fisher test); improvement of cognitive impairment in 88.5% of D + and 86.4% of D− patients (p value > 0.99, from Fisher test); and improvement of urinary incontinence in 46.9% of D + and 58.8% of D− patients (p value = 0.57, from Fisher test). These findings suggested that the effect of the spinal tap on gait, incontinence and cognitive performance could be independent of the presence of concomitant dementia; although the sample size for comparisons was small and thus underpowered to detect small differences.
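For readers unfamiliar with the Fisher test used in these comparisons, the sketch below computes a two-sided exact p-value for a 2×2 table by summing the probabilities of all tables that are no more likely than the observed one. The cell counts behind the percentages above are not given in this section, so the example table is hypothetical and the function name is illustrative.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]].

    Rows could be D + and D- patients; columns improved / not improved.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def weight(k: int) -> int:
        # Unnormalized hypergeometric probability of k in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k)

    low, high = max(0, col1 - row2), min(row1, col1)
    weights = [weight(k) for k in range(low, high + 1)]
    observed = weight(a)
    # Integer comparison avoids floating-point ties between equal tables.
    return sum(w for w in weights if w <= observed) / denom


# Hypothetical counts, for illustration only.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))
```

The exact test is a common choice when some cells are small, as tends to happen when a cohort of this size is split into two groups.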

Table 6 Comparisons by History of Dementia

The performance in cognitive tests at baseline (before the tap test) showed differences between D- and D + in the proportion of patients that had an altered performance in PVF (PVF = 50% vs 79.5%, p-value = 0.042), in the ROCF total score (ROCF = 20.8% vs 68.1%, p-value < 0.001), and in the STM (STM = 45.5% vs 72.7%, p-value = 0.057), in which the D + group showed a higher proportion of patients with altered performance. After the spinal tap, significant differences between D + and D- were observed in the proportion of patients that had a deficit score in verbal fluency (PVF = 34.8% vs 73.2%, p value = 0.006; SVF = 57.1% vs 92.1%, p-value = 0.004), in the ROCF total score (ROCF = 25% vs 57.4%, p-value = 0.019), and in executive functions (IFS = 75% vs 97.8%, p-value = 0.009), in which the D + group showed a higher proportion of patients with altered performance.

Finally, the proportion of improvement in z-scores after the spinal tap was significantly different between groups regarding the SVF and IFS. The D- group showed a significantly higher proportion of improvement than the D + group in the test of semantic verbal fluency (SVF = 28.6% vs 2.9%, p-value = 0.003) and in executive functions (IFS = 16.7% vs 0%, p-value = 0.014). These results suggest the D + group exhibited a higher proportion of deficit in cognitive tests at both pre-tap test and post-tap test evaluation. The D- group showed a higher proportion of improvement after the tap test in executive function and verbal fluency test.

4 Discussion

This study, which aimed to describe and assess the cognitive profile of NPH patients at baseline and after a lumbar tap test, found impairments in all cognitive processes and a significant improvement in visuoconstructional praxis and executive functions during the diagnostic process.

Cognitive impairments at baseline included a wide spectrum of disturbances, mainly in the executive function domain, followed by semantic and phonological verbal fluency, visuo-constructional praxis, attentional abilities, verbal explicit learning, and memory. Similar findings have been reported previously, including impairments in short term memory, bradypsychia and difficulties in the ability to find and pronounce words correctly [5], attention, executive functions, verbal fluency, memory [25], as well as visuo-constructional praxis [26]. To highlight, in our study, memory disturbances in the free recovery processes were notable. This ability, besides being mediated by both the information storage and recovery processes, is also affected by the executive functions domain because it depends on the capacity of retrieving stored information, usually affected in clinical conditions compromising subcortical structures. In accordance with our findings, some authors have postulated that memory deterioration is associated with the subcortical profile that characterizes this disease [4, 25]. Likewise, it has been proposed that memory impairments may result from decrease in the functional integration of the frontal lobes [27].

Although cognitive impairments at baseline can offer valuable information regarding the etiology of the symptoms, this is not enough to make the diagnosis of NPH, nor to recommend CSF shunt surgery [28, 29, 30]. Symptom improvement, or lack thereof, after a lumbar tap test provides valuable information needed to confirm this diagnosis and determine a patient’s prognosis regarding shunt responsiveness. If a patient’s symptoms improve after the tap test, this improvement will be expected as well after shunt surgery [1, 28].

The improvement observed in the cognitive profile after the tap test in the executive functions in our study is consistent with a previous study reporting improvement of inhibitory control after a tap test [31, 32]. Improvement in verbal memory and mental speed [31], and in processes related to information search and access, such as those involved in the verbal fluency task [33], has also been reported. Patients in our study did not display verbal memory and verbal fluency changes.

Some of the discrepancies between our results and the findings in other published studies may be explained by several factors. We chose these factors because they have been discussed in much of the research that has evaluated changes after the tap test [1, 4, 31, 32]. One is the time of clinical assessment, described in some studies at 2–6 h [31, 33], at 24 h [1, 4, 32], and up to 1 week [1, 32] after the lumbar puncture. In our study, the clinical assessment was performed 24 h after the tap test. Nevertheless, no optimal time for assessment has been established in the literature [1]. Other factors, like the quantity of CSF removed or the characteristics of the setting, could explain the differences between our results and those reported in previous studies. However, no significant differences in clinical outcomes have been reported in the reviewed scientific literature when 30–50 ml are removed [1]; therefore, such differences are unlikely to be influenced by variations in the volume of cerebrospinal fluid removed. Likewise, the hospital setting could produce fatigue in the patient and skew the results [31]; however, the reviewed studies were also performed in this setting. Thus, it is unlikely that such factors explain the discrepancies observed. Better explanations are the use of different time intervals between evaluations and differences in sample size that allow higher statistical power to detect changes.

On the other hand, we did not evaluate the association between neuroimaging markers and cognitive profiles in our sample, but the alterations in multiple cognitive processes suggest a widespread alteration of cortical and subcortical structures. In fact, differences in the level of impairment in patients with NPH are thought to depend on the extent of brain injury [34]. Thus, executive dysfunction or a global cognitive compromise could be generated by disruptions in subcortical areas connecting with the frontal lobe cortex disturbance [34]. In addition, other structural alterations such as extension of the anterior horn of the lateral ventricles, accompanied by a significant compression of the brain capillaries due to the increased hydrostatic tissue fluid pressure of the parenchyma [3], frontal and anterior area of corpus callosum dysfunction [3], axonal injury, brain white matter ischemic demyelination, and microinfarcts have also been associated with cognitive impairment in NPH patients [13].

The level of improvement after the lumbar tap test, both in magnitude and domain aspects, is highly dependent on clinical variables, like alcohol consumption and comorbidities such as vascular risk factors and dementia like Alzheimer’s Disease (AD) [35, 36, 37, 38, 39]. Indeed, most of the patients in our cohort had a pre-existing dementia diagnosis (68.42%), which may have negatively influenced the improvement after the lumbar tap test. Other variables such as cognitive reserve have been associated with better coping with age-related cognitive decline and the negative consequences of brain pathology [40]. Although we considered the educational level in our sample, other sociobehavioral indicators of cognitive reserve (occupational achievements and leisure activities) were not included but could represent a topic for future research.

Numerous studies have found common biomarkers in both AD and NPH patients [40, 41]. Our study does not discriminate between the multiple possible etiologies of D + patients, but in the case of common biomarkers, subsequent studies could discriminate between the clinical profiles of patients with AD and patients with NPH in order to clarify the diagnoses and improve the expectations of families and patients regarding their improvement.

In examining the cognitive performance after the tap test in patients with NPH, our study brings forth valuable insights into the discrepancy between individuals meeting criteria for dementia (D +) compared to those without dementia criteria (D−). Our analysis revealed a nuanced perspective on the cognitive changes associated with lumbar puncture, emphasizing the need to consider the baseline cognitive differences within these subgroups. Notably, our findings demonstrate divergent patterns in the proportion of improvement in z-scores post-spinal tap, with specific emphasis on semantic verbal fluency (SVF) and executive functions (IFS). The D− group exhibited a significantly higher proportion of improvement in these cognitive domains compared to the D + group, underscoring the distinct cognitive responses to the tap test within each cohort. These observations highlight the necessity for tailored interventions that recognize and address the unique cognitive profiles present in different subgroups of NPH patients. As D + patients did not seem to improve substantially in semantic fluency, thus showing a differential cognitive profile between D + and D− patients in this cognitive domain, it should be noted that impairments in semantic access have been closely related to AD [42, 43, 44]; therefore, this finding is promising as a relevant clinical element in the diagnostic process of suspected NPH patients. Moreover, in contrast to AD, memory impairment is not the dominant manifestation in NPH [45]. These findings support the specificity of cognitive impairment in patients with NPH as a single clinical entity, predominantly affecting executive functions.

Our study provides significant results in the clinical and research field as it underlines the importance of semantic fluency and executive functioning tests in the differential diagnosis of normal pressure hydrocephalus in tap test protocols. In fact, an improvement in semantic fluency tests could be a significant clinical variable when discriminating between D + and D−. Moreover, our study provides valuable information comparing the cognitive profiles of two groups: patients with NPH meeting the criteria for a dementia diagnosis (D +) and those without criteria for dementia (D−) at both baseline and after the tap-test intervention, providing insights into the cognitive differences between these two patient cohorts.

A strength of our study is that NPH diagnosis was achieved through an interdisciplinary evaluation by neurosurgery, physiatry, neuropsychology, and physical, occupational and speech therapy. Even though the literature on NPH highlights the impairment in executive functions and attention, we consider it highly relevant to evaluate several cognitive domains in order to better characterize the baseline and after-tap test cognitive profiles. The cognitive tests that we propose have some advantages: they are not time-consuming and they include tasks that evaluate different levels of cognitive ability, which is important when examining patients with dementia or low education. Moreover, we carefully selected memory tests with a similar structure, allowing us to avoid learning effects in the post-tap test evaluation.

One of the limitations of this study is that a direct measurement of attentional processes was not available. Additionally, the type of dementia could not be determined in cases of comorbidity and we do not have a cognitive follow-up after CSF shunt. Further studies should include a longitudinal follow-up and should establish the etiology of dementia. This would help to precisely determine the improvement after CSF shunt procedures in patients with mixed dementia or other comorbidities.

We acknowledge the limitation regarding the lack of subgroup analysis based on the degree of cognitive deterioration in the dementia group. The decision not to stratify the dementia group into mild, moderate, and severe categories was made due to the current constraints in sample size. We believe that maintaining a more consolidated group (D +) facilitates meaningful comparisons with the non-dementia group (D−). We recognize the importance of future studies with larger sample sizes that can explore nuanced variations in cognitive impairment severity and its implications.

5 Conclusions

In conclusion, the examination of cognitive processes before and after the tap test in NPH patients reveals a significant improvement in executive processes and constructional praxis. Furthermore, comparing cognitive alterations between D + and D− patients, as expected, worse performance was observed before the tap test in most of D + patients, especially in IFS, PVF, SVF, and ROCF. While an improvement in symptoms of the clinical triad was observed in both D + and D− groups, a significant improvement in SVF and IFS was observed exclusively in D- patients.

These findings underscore the cognitive impairment that occurs in NPH patients, the improvement that can be achieved with the tap test, and therefore its importance during the diagnosis and prognosis of the disease. We hope our findings will help improve this diagnostic process by aiding clinicians in knowing what changes to look for in the cognitive symptoms after the tap test, through a more formal evaluation of these impairments.

Determining the cognitive profile in NPH is essential, not only to contribute to the diagnosis of the disease, but also to identify the level of cognitive impairment that is present, that is, whether it is compatible with dementia or a mild cognitive disorder [25]. Additionally, looking for the etiology of cognitive symptoms could help tailor expectations around cognitive improvement after a CSF shunt, because persistence or reappearance of these symptoms is observed in patients with dementia of multiple etiologies (NPH plus another cause) [46].

Finally, we want to highlight the importance of undertaking the assessment of NPH patients through an interdisciplinary approach, in which the diagnostic role of the tap test in combination with a structured cognitive evaluation constitutes a valuable tool. We call attention to the importance of determining the cognitive profile of the patient to understand the changes in their daily activities, offer them a conducive environment in their home and facilitate the management of the expectations of patients and their families, which will finally produce a positive impact on the patients and caregivers.

5.1 What is already known on this topic

Cognitive impairment in NPH patients can affect different processes (attention and executive process, psychomotor speed, visuoconstructional skills), and significant improvement in various domains has been reported after lumbar puncture. However, there is no consensus about the cognitive profile of NPH patients, and further studies are needed to determine which cognitive processes are most likely to improve after shunt.

5.2 What this study adds

This study describes and assesses the cognitive profile of NPH patients at baseline and after the tap test. We found impairments in all cognitive processes, mainly in the executive function domain; we also found a significant improvement in visuoconstructional praxis and executive functions, as well as differential profiles in patients with preexisting dementia, especially in semantic information access and memory impairment.

5.3 How this study might affect research, practice or policy

The findings of this study highlight clinical elements that can contribute to the diagnosis of NPH and suggest a standardized protocol that will ensure a proper diagnosis for these patients.