Introduction

Spondylolisthesis is the slippage of one vertebral body over the subjacent one due to several mechanisms [1]. The etiology is mostly the deficiency of facet joints in patients with degenerative spondylolisthesis (DS), and the defect of pars interarticularis in patients with isthmic spondylolisthesis (IS) [2]. Congenital abnormalities of the upper sacrum or the arch of L5, fractures, and bony disorders could cause dysplastic, traumatic, and pathologic spondylolisthesis, respectively [3].

Conservative management, including pain relief, physical therapy, and exercise, is the treatment of choice in patients with stable low-grade spondylolisthesis without neurogenic claudication or radiculopathy [4,5,6,7]. It has been reported that comprehensive conservative management including patient education, pain control with medications, and transforaminal epidural injections, followed by a 6–8-week exercise program, decreased the need for surgery in DS and IS [8, 9]. Surgery is required in cases of persistent or recurrent low back pain (LBP) and/or leg pain, significant impairment in quality of life, progressive neurological deficits, and bladder/bowel symptoms despite a reasonable trial of conservative management [10]. Arthrodesis is the preferred surgical approach in patients with spondylolisthesis [1].

Conservative management should focus on improving the paraspinal muscles since these muscles are the main stabilizers of the lumbar spine. It has been shown that the thoracic fibers of the lumbar erector spinae contributed between 70% and 86% of the total extensor moment exerted on the upper lumbar spine [11]. The thoracic fibers of the erector spinae become an aponeurosis at the lumbar spine [12]. The lumbar fibers of the erector spinae and the multifidus contributed 30% and 20% of the total extensor moment exerted on the lower lumbar spine, respectively [11]. It has been shown that the psoas exerts very small moments to extend the upper lumbar spine and to flex the lower lumbar spine [13]. A recent study reported that women with chronic LBP had a less fat-infiltrated psoas to compensate for a fatty multifidus at the L4-L5 disc level [14]. The authors proposed that the psoas muscle acted like a rod placed in a cylinder that could resist instability in all directions. Paraspinal muscles (multifidus, erector spinae and psoas) have been studied with respect to their roles in patients with IS and DS. Wang et al. [15] reported that the relative cross-sectional area (CSA) of the paraspinal muscles was smaller in patients with DS. Even patients with different types of spondylolisthesis (IS vs. DS) had different CSA and fatty infiltration patterns in their paraspinal muscles. Li et al. [16] reported that patients with IS and control subjects had bigger erector spinae muscles than those with DS. Cao et al. [17] suggested that fatty infiltration in the psoas was an independent factor related to DS in asymptomatic adults. However, these studies evaluated the paraspinal muscles at only the L4-L5 level [15,16,17]. A recent study showed that multifidus degeneration was more significant in patients with lumbar DS compared to those with degenerative lumbar kyphosis [18].
However, patients who received conservative management and those who underwent surgery for their lumbar spondylolisthesis have not been compared in terms of their paraspinal muscles at all lumbar levels.

In this study we aimed to identify whether paraspinal muscle morphology could play a role in surgical decision-making in patients with lumbar spondylolisthesis. Our aim was to help clinicians and surgeons to distinguish the surgical candidates with lumbar spondylolisthesis from those who could get better with conservative management more easily.

Methods

Patient cohort

We conducted a cross-sectional analysis of a prospectively collected database between January 2013 and May 2023. Consecutive women and men who visited our outpatient clinics with chronic LBP and neurogenic claudication, and who had undergone lumbar spine magnetic resonance imaging (MRI) for complaints lasting > 6 weeks despite conservative management, were included in the preliminary dataset. We recorded patients who underwent surgery for lumbar spondylolisthesis at the L4-L5 (due to DS) or L5-S1 (due to IS) level following two consecutive unsuccessful standardized trials of conservative management in our clinic. Then, we excluded those subjects who underwent surgery for moderate/severe spinal stenosis and/or neurological deficits. Moderate and severe spinal stenosis were characterized by aggregation of cauda equina strands and appearance of the cauda equina as a bundle of strands, respectively [19]. Patients who received conservative management including medication, physical rehabilitation, and intervention for lumbar spondylolisthesis at either the L4-L5 or L5-S1 level were recruited as the conservative group. Patients who underwent surgery for spondylolisthesis (surgical group, n = 32) were matched for age, sex, subtype (10 isthmic cases, 22 degenerative cases), level (L4-L5 for DS and L5-S1 for IS cases), and grade (Meyerding grade 1 or 2) with those who received conservative management (conservative group, n = 32). The institutional review board approved this study (IRB no: FSMEAH-KAEK 2023/33). The study protocol complied with the Declaration of Helsinki and its later amendments. Patients who met the following criteria were excluded: inability to obtain MRI, trauma, history of previous spine surgery, spinal infection, scoliosis, lumbarization/sacralization, kyphosis, neurological or psychiatric disorders, endocrine or rheumatic diseases, malignancy, and pregnancy.

Imaging modality

Patients were evaluated on lumbar spine MRIs (1.5 Tesla, Philips, Amsterdam, The Netherlands) using Picture Archiving and Communication System (PACS). Images were obtained in sagittal T1-, sagittal Turbo Spin Echo T2-, sagittal fat-saturated T2-, axial Turbo Spin Echo T2-, and coronal Turbo Spin Echo T2-weighted sequences using dStream TotalSpine coil (Philips, Amsterdam, The Netherlands). The imaging parameters were: echo time, 14 ms and 100 ms; repetition time, 440 ms and 3222 ms for T1- and T2-weighted sequences, respectively. The field of view (FOV) was 160 mm (anteroposterior [AP]) x 270 mm (feet-head [FH]) x 66 mm (right-left [RL]); voxel size was 1.1 mm (AP) x 1.5 mm (FH) x 4 mm (RL); matrix size was 144 (AP) x 168 (FH) x 15 (RL), and slices had a gap of 0.4 mm for all MRI sequences and all MRI orientations.

Evaluation of the spine, paraspinal muscles, and spinopelvic parameters

Intervertebral discs and vertebral endplates were evaluated by one author (UOÖ). Lumbar intervertebral disc degeneration (IVDD) was graded from L1-L2 to L5-S1 disc levels using the Pfirrmann grading system on T2-weighted sagittal lumbar spine MRIs [20]. Vertebral endplates were assessed from L1-L2 to L5-S1 disc levels using the Modic classification on T1- and T2-weighted sagittal lumbar spine MRIs [21, 22]. Intervertebral discs with Pfirrmann grade I-III IVDD were recorded as ‘mild-to-moderate IVDD’ whereas those with Pfirrmann grade IV-V IVDD were categorized as ‘severe IVDD’ [23]. Modic changes were recorded as ‘present/absent’ to simplify the statistical analyses.
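The dichotomization described above can be sketched in a few lines. This is only an illustration of the stated rule (Pfirrmann I-III versus IV-V, Modic present versus absent); the function names and the integer encoding of grades (1 to 5 for Pfirrmann, 0 for no Modic change) are assumptions made for the example and are not taken from the study.

```python
def ivdd_category(pfirrmann_grade: int) -> str:
    """Collapse a Pfirrmann grade (1-5, i.e. I-V) into the two study categories."""
    if pfirrmann_grade not in (1, 2, 3, 4, 5):
        raise ValueError("Pfirrmann grade must be an integer from 1 to 5")
    # Grades IV-V are 'severe'; grades I-III are 'mild-to-moderate'.
    return "severe IVDD" if pfirrmann_grade >= 4 else "mild-to-moderate IVDD"


def modic_present(modic_type: int) -> bool:
    """Return True when any Modic change is recorded.

    Assumed encoding (hypothetical): 0 = no change, 1-3 = Modic type I-III.
    """
    if modic_type not in (0, 1, 2, 3):
        raise ValueError("Modic type must be 0 (absent) or 1-3")
    return modic_type != 0
```

For example, `ivdd_category(3)` yields the mild-to-moderate label and `ivdd_category(4)` the severe label, which mirrors the per-level coding that feeds the group comparisons.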

One author (FT) evaluated multifidus, erector spinae, and psoas muscles from L1-L2 to L5-S1 intervertebral disc levels in terms of fatty infiltration using Goutallier classification system on T2-weighted axial lumbar spine MRIs (Fig. 1) [24].

Fig. 1

Assessment of fatty infiltration in the paraspinal muscles in a patient with lumbar spondylolisthesis who received conservative management. L1-L2, L2-L3, and L5-S1 disc levels: Goutallier scores of the multifidus, erector spinae and psoas muscles were 2, 2, and 1, respectively. L3-L4 and L4-L5 disc levels: Goutallier scores of the multifidus, erector spinae and psoas muscles were 3, 2, and 2, respectively.

Pre-operative spinopelvic parameters (lumbar lordosis [LL] and pelvic incidence [PI]) were measured by one author (CSY) on lumbar spine computed tomography scans obtained with the patients in the supine position with straightened lower limbs, as described previously [25].

Statistical analysis

Data were analyzed using the Statistical Package for the Social Sciences (SPSS) version 20.0 (IBM, Armonk, New York, USA). Categorical variables were presented as absolute numbers and percentages, while continuous variables were presented as mean values with standard deviations. Normality of the data was assessed with the Kolmogorov-Smirnov and Shapiro-Wilk tests. Because the data were not normally distributed, continuous variables were compared using the Mann-Whitney U test. Categorical variables were compared with either the Chi-square test or Fisher’s exact test. Binary logistic regression analysis was conducted on the variables that differed significantly between the groups to identify predictors of surgery in patients with lumbar spondylolisthesis. Results of the regression analysis were presented as odds ratios (OR) with 95% confidence intervals (CI). Receiver operating characteristic (ROC) analysis was performed to define a cut-off value for any predictor(s) of surgery for lumbar spondylolisthesis. The area under the curve (AUC) was presented with its 95% CI. Since each radiological parameter was evaluated by a single author, only intra-rater reliability values were calculated for IVDD, Modic changes, fatty infiltration in paraspinal muscles and spinopelvic parameters. For this purpose, radiological measurements were repeated 1 month after the first measurements in 20 randomly selected patients (10 patients from the surgical group and 10 patients from the conservative group). Intra-rater reliability values for IVDD, Modic changes, and fatty infiltration in paraspinal muscles were calculated using the method described by Landis and Koch [26]. A p value < 0.05 was accepted as statistically significant.
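As an illustration of the intra-rater reliability step, the sketch below computes an unweighted Cohen's kappa for two rating sessions and labels it with the agreement bands published by Landis and Koch. It assumes that the reliability values for categorical grades are kappa-type coefficients; the study itself used SPSS, and the function names here are hypothetical.

```python
from collections import Counter


def cohen_kappa(first_session, second_session):
    """Unweighted Cohen's kappa for two equally long lists of categorical ratings."""
    if len(first_session) != len(second_session) or not first_session:
        raise ValueError("both sessions must be non-empty and of equal length")
    n = len(first_session)
    # Observed agreement: share of cases given the same grade in both sessions.
    observed = sum(a == b for a, b in zip(first_session, second_session)) / n
    # Chance agreement: product of the marginal proportions, summed over grades.
    counts_1, counts_2 = Counter(first_session), Counter(second_session)
    expected = sum(counts_1[g] * counts_2[g] for g in counts_1) / (n * n)
    if expected == 1.0:
        return 1.0  # only one grade was ever used in both sessions
    return (observed - expected) / (1.0 - expected)


def landis_koch_label(kappa):
    """Map a kappa value to the Landis and Koch strength-of-agreement band."""
    if kappa < 0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
             (0.80, "substantial"), (1.00, "almost perfect")]
    for upper_bound, label in bands:
        if kappa <= upper_bound:
            return label
    raise ValueError("kappa cannot exceed 1")
```

Read against these bands, a coefficient of 0.737 would fall in the "substantial" range and coefficients above 0.80 in the "almost perfect" range.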

Results

Demographics

We had 64 patients (women: 84.4% [n: 54], men: 15.6% [n: 10]; age: 55.00 ± 12.3 years [range: 23–81 years]). Each group consisted of 27 women (84.4%) and 5 men (15.6%). Surgical and conservative groups were similar in terms of age (54.84 ± 9.71 years vs. 55.15 ± 14.65 years, p = 0.856), lumbar lordosis (41.60 ± 13.15° vs. 41.22 ± 15.91°, p = 0.742) and pelvic incidence (58.54 ± 9.16° vs. 59.76 ± 12.24°, p = 0.985). There were 10 patients with IS presented at L5-S1 level and 22 patients with DS presented at L4-L5 level in each group.

Intra-rater reliability tests

Intra-rater reliability values for the evaluation of IVDD, Modic changes, fatty infiltration in the paraspinal muscles and spinopelvic parameters were 0.928 (95% CI: 0.867–0.960, p < 0.001), 0.918 (95% CI: 0.857–0.954, p < 0.001), 0.737 (95% CI: 0.669–0.790, p < 0.001), and 0.857 (p < 0.01), respectively.

Comparison of the groups

Patients who underwent surgery for spondylolisthesis (surgical group, n = 32) and those who received conservative management (conservative group, n = 32) did not differ significantly in the frequency of severe IVDD or Modic changes at any lumbar level (Table 1). The surgical group had significantly fattier erector spinae than the conservative group (Table 2). Regression analysis revealed an OR of 1.088 (95% CI: 1.010–1.173, p = 0.026) for fatty infiltration in the erector spinae as a predictor of surgery for lumbar spondylolisthesis. ROC analysis yielded a cut-off value of 17 points (AUC: 0.652, 95% CI: 0.517–0.786, p = 0.037) for fatty infiltration in the erector spinae to predict which patients would undergo surgery for spondylolisthesis. The sensitivity and specificity at this cut-off value were 69% and 50%, respectively (Fig. 2).

Table 1 Comparison of the conservative group with the surgical group in terms of severe intervertebral disc degeneration and presence of Modic changes
Table 2 Comparison of the conservative group with the surgical group in terms of spine degeneration and fatty infiltration in paraspinal muscles
Fig. 2

Receiver operating characteristic (ROC) curve of fatty infiltration in the erector spinae predicting surgical option for lumbar spondylolisthesis
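The way a ROC cut-off, its sensitivity and specificity, and the AUC relate to one another can be sketched as follows. This is a minimal illustration, not the SPSS procedure used in the study: the choice of Youden's index as the selection rule is an assumption, the function names are hypothetical, and the scores are invented toy values rather than patient data. A score at or above the cut-off is treated as a positive (surgical) prediction.

```python
def sensitivity_specificity(surgical_scores, conservative_scores, cutoff):
    """Sensitivity and specificity when 'score >= cutoff' predicts surgery."""
    true_positives = sum(s >= cutoff for s in surgical_scores)
    true_negatives = sum(s < cutoff for s in conservative_scores)
    return (true_positives / len(surgical_scores),
            true_negatives / len(conservative_scores))


def youden_cutoff(surgical_scores, conservative_scores):
    """Cut-off maximizing Youden's J = sensitivity + specificity - 1.

    Ties are resolved in favor of the lowest candidate cut-off.
    """
    candidates = sorted(set(surgical_scores) | set(conservative_scores))
    return max(
        candidates,
        key=lambda c: sum(sensitivity_specificity(
            surgical_scores, conservative_scores, c)) - 1.0,
    )


def auc(surgical_scores, conservative_scores):
    """AUC as the probability that a surgical score exceeds a conservative one.

    Ties count as one half (the Mann-Whitney formulation of the AUC).
    """
    wins = 0.0
    for s in surgical_scores:
        for c in conservative_scores:
            if s > c:
                wins += 1.0
            elif s == c:
                wins += 0.5
    return wins / (len(surgical_scores) * len(conservative_scores))


# Invented toy scores for illustration only (not study data).
toy_surgical = [15, 17, 18, 20]
toy_conservative = [10, 12, 14, 19]
best_cutoff = youden_cutoff(toy_surgical, toy_conservative)
```

With 32 patients per group, a sensitivity of 69% and a specificity of 50% would correspond to roughly 22 of 32 surgical patients scoring at or above the cut-off and 16 of 32 conservative patients scoring below it.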

Discussion

Background knowledge

Lumbar DS could occur due to IVDD, facet joint degeneration, ligamentous laxity, and paraspinal muscle atrophy/fatty infiltration [27]. Lumbar IS results from an abnormality of the pars interarticularis [28]. Even though many factors have been proposed for the etiology of DS and IS, the exact triggers for the onset and progression of spondylolisthesis are still unclear [29,30,31]. Lumbar DS could cause low back pain (LBP) and leg symptoms (neurogenic claudication) due to the slippage and concomitant lumbar spinal stenosis, respectively [27]. Lumbar IS is usually asymptomatic, and mostly diagnosed incidentally on plain radiographs [32]. It could also present with LBP and/or leg pain, and neurologic deficits at or below the level of the injury following repetitive hyperextension or rotation of the lumbar spine [28, 33].

The Meyerding grading system is still considered the gold standard classification for spondylolisthesis [34]. However, it does not include clinical (pain, functionality, quality of life) and radiological (spino-pelvic alignment) parameters. Wiltse, Newman, Macnab, Marchetti-Bartolozzi, Mac-Thiong and Labelle, the Spinal Deformity Study, French Society for Spine Surgery, clinical and radiographic degenerative spondylolisthesis (CARDS), and Gille classification systems have been proposed to improve grading in patients with spondylolisthesis [3, 35,36,37,38,39,40,41]. Even though these classifications improved the categorization of patients with lumbar spondylolisthesis, the predictive factors for surgery are still unclear.

Does radiology correlate with clinical findings?

The correlation between DS progression and intensity of pain is controversial [31]. Sinha and George [42] observed that even severe radiologic foraminal stenosis without radicular pain did not seem to push patients to undergo surgery. A recent meta-analysis showed that not all patients with DS end up with significant pain/disability and eventually require surgery [31]. Similar observations also apply to IS. Beutler et al. [43] screened 500 elementary school subjects between 1954 and 1957 for spondylolysis and IS. They found 30 subjects with spondylolysis and IS and prospectively followed them for the next 45 consecutive years. Those subjects did not report any difference in pain, disability, or quality of life compared to the general population during the follow-up. Andrade et al. [44] conducted an epidemiologic systematic review of 15 published observational studies analyzing any association between IS and clinical findings. Their findings did not suggest a strong association between those two entities.

In the present study, we aimed to identify whether we could predict the surgical candidates for lumbar spondylolisthesis (either due to DS or IS) by evaluating their lumbar spine intervertebral discs, end-plates, and paraspinal muscles. Our aim was to help clinicians and surgeons distinguish more easily the surgical candidates with lumbar spondylolisthesis from those who could get better with conservative management.

Confounders

Progression of spondylolisthesis is associated with angulation of the disc, increased loading across the disc space, lower intercristal line, increased pelvic incidence, and joint hyperlaxity [45,46,47]. Formation of spurs, loss of disc height, and ossification of the ligaments are compensatory changes in those subjects [45]. However, these compensatory changes could cause spinal stenosis and limited range of motion [48, 49]. To overcome such confounders in the present study, we excluded the patients with moderate/severe spinal stenosis based on the classification of Lee et al. [19] and those with neurological deficits.

A recent meta-analysis reported that demographic and radiological factors including female sex, body mass index, menopause, early IVDD, sagittal facet joint orientation, joint laxity, high pelvic incidence, presence of spondylolisthesis at L4-L5 level, increased L4 or L5 vertebral angle, increased L1 axis S1 distance, increased lumbar lordotic angle, and having more than 25% slippage were associated with the symptoms of DS [31, 45,46,47, 49,50,51,52,53,54,55]. It has been shown that age, pelvic incidence, facet joint angle, and pedicle facet angle were associated with IS [29]. To overcome those confounders, our patients were matched in terms of age, sex, lumbar lordosis, pelvic incidence, subtype (DS or IS) and Meyerding grade of their spondylolisthesis.

Significance of the current findings

The main role of the paraspinal muscles is to maintain the upright posture and dynamic stability of the spine [56]. Our results showed that the fattier a patient's erector spinae, the more likely the patient was to have undergone surgery. Each 1-point increment in fatty infiltration in the erector spinae at any lumbar level increased the odds of surgery by 8.8% (OR: 1.088).
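The per-point effect follows from the reported OR of 1.088 (95% CI: 1.010–1.173). A short sketch of how such an odds ratio compounds over several points is given below. It assumes the usual logistic-regression reading in which the odds multiply by the OR for each additional point; the function names and the baseline probability in the example are illustrative and do not come from the study.

```python
import math

OR_PER_POINT = 1.088          # reported odds ratio per 1-point increment
OR_CI_95 = (1.010, 1.173)     # reported 95% confidence interval


def odds_multiplier(points, or_per_point=OR_PER_POINT):
    """Factor by which the odds of surgery change for a given score difference."""
    return or_per_point ** points


def probability_after(baseline_probability, points, or_per_point=OR_PER_POINT):
    """Convert a baseline probability to the probability after a score change.

    The baseline probability is a hypothetical input; in a matched 1:1 sample
    it does not represent the surgery rate in the general patient population.
    """
    if not 0.0 < baseline_probability < 1.0:
        raise ValueError("baseline probability must lie strictly between 0 and 1")
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_multiplier(points, or_per_point)
    return new_odds / (1.0 + new_odds)


# Logistic-regression coefficient implied by the odds ratio (beta = ln OR).
beta_per_point = math.log(OR_PER_POINT)
```

Under this reading, a 5-point difference multiplies the odds by 1.088 raised to the fifth power, a little over 1.5. The change in probability depends on the baseline, so an odds ratio should not be read directly as a percentage change in risk.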

Wang et al. [57] compared age- and sex-matched patients with DS and patients with LBP without spondylolisthesis. They found that patients with DS had loss of anterior disc height, atrophied multifidus, and hypertrophied erector spinae muscles, indicating compensatory mechanisms. However, they did not compare patients who underwent surgery with those who did not.

Thakar et al. [58] compared 120 patients with IS with an age- and sex-matched normal population. They reported that patients with IS had selective atrophy of the multifidus and compensatory hypertrophy of the erector spinae. Ding et al. [18] reported that patients with DS and those with degenerative kyphosis had different patterns of paraspinal muscle degeneration. Multifidus muscle degeneration was more significant in patients with DS whereas erector spinae degeneration was more significant in those with degenerative kyphosis. However, these studies did not analyze the predictor(s) for surgery.

In our study, patients with fattier erector spinae required surgery whereas those with better erector spinae benefited from conservative management. Our findings are consistent with those of Ding et al. [18], who concluded that the erector spinae tended to maintain the spinal sagittal balance; when the erector spinae cannot contract properly, subjects with lumbar spondylolisthesis might become surgical candidates. A recent case-control study comparing subjects with LBP and those without LBP reported that fatty infiltration in the erector spinae at the upper lumbar spine was the main predictor for LBP [59]. Fatty infiltration impairs the quality of the paraspinal muscles since fat tissue is noncontractile [60,61,62]. Thus, a hypertrophied muscle could also be fatty. We suggest that fatty, and therefore poor-quality, erector spinae muscles cannot stabilize the spine properly. Thus, patients were more likely to require surgery when the fatty infiltration score for the erector spinae was at or above the cut-off value of 17. Although a close association has been reported between IVDD, Modic changes and fatty degeneration of the paraspinal muscles, there was no significant difference in IVDD severity or Modic changes between the groups (Tables 1 and 2) [63].

Limitations

The present study inherits the limitations of a cross-sectional study. Thus, we could not present a cause-effect relationship. Clinical outcome measurements of the subjects were lacking due to the retrospective nature of data collection. Our sample size was too small to identify sex- or subtype- (IS vs. DS) related differences. We included 10 patients with IS and 22 patients with DS with either grade 1 or 2 spondylolisthesis in both groups. We did not evaluate the facet joint orientation of the subjects, which could have some impact on the natural history of the disease. Since this study was a retrospective one, full standing spine radiographs were not available in some patients. To homogenize the measurement of spinopelvic parameters, we measured sagittal lumbopelvic angles on lumbar spine computed tomography scans in the supine position with straightened lower limbs, as previously shown in the literature [25]. To the best of our knowledge, this is the first study to compare patients who underwent surgery and those who received conservative management for lumbar spondylolisthesis in terms of IVDD, Modic changes, and paraspinal muscles at all lumbar levels. We are also the first to report the possible role of paraspinal muscle quality in predicting surgical candidates with lumbar spondylolisthesis. The strength of the present study is its cohort of subjects matched in terms of age, sex, lumbar lordosis, pelvic incidence, subtype and grade of spondylolisthesis.

Conclusion

Fatty erector spinae could predict the surgical candidates with lumbar spondylolisthesis. Each 1-point increment in fatty infiltration in the erector spinae at any lumbar level increased the odds of surgery by 8.8% (OR: 1.088). Lumbar spondylolisthesis patients with a fatty infiltration score for the erector spinae at or above 17 were more likely to have surgery. We recommend that clinicians focus on improving the erector spinae in patients with lumbar spondylolisthesis. Future studies should focus on subgroups of lumbar spondylolisthesis.