Introduction

Low back pain (LBP), which affects 70–85% of individuals at some point in their lives, presents a significant global public health challenge and imposes a considerable financial burden on healthcare and social systems [1,2,3]. In general, LBP refers to discomfort, tension, or stiffness felt in the region between the ribcage and the inferior gluteal folds, often accompanied by leg pain (sciatica) and other neurological symptoms affecting the lower extremities [1, 4]. While various factors can play a role in the development of LBP, intervertebral disc degeneration (IVDD) stands out as one of the primary causes [4, 5]. IVDD serves as the pathological foundation for various spinal degenerative disorders and is a prevalent orthopedic condition that contributes to a reduced quality of life [6]. The intervertebral disc (IVD) consists of the nucleus pulposus (NP), annulus fibrosus, and the cartilage endplate, which are primarily composed of collagen and proteoglycan, imparting crucial properties to the disc. The NP is a critical component of the IVD, primarily made up of NP cells and the extracellular matrix (ECM). IVDD is characterized by the gradual loss of proteoglycans and water content within the NP [7]. As the disease progresses, the discs may break down, rendering them more susceptible to herniation, which can compress the spinal nerves and nerve roots. The irritation of nerves in the lower back as a result of IVDD is known as lumbar radiculopathy. If this occurs in the nerve roots of L4-S2, it commonly results in a distinct type of pain known as sciatica [8, 9].

The gut microbiota (GM) refers to the distinct microbial populations, including bacteria, protozoa, fungi, archaea, and viruses, that inhabit the intestinal tract and coexist in a mutually beneficial relationship with the host organism [10]. It has the potential to influence multiple physiological processes, including metabolism, inflammation, and immune responses [11,12,13,14]. The taxonomic characteristics of the gut microbiota and their potential roles are mainly identified using 16S rRNA and metagenomic sequencing [15]. In a recent study by Rajasekaran et al. [16], a total of 24 lumbar IVDs were analyzed, revealing that the microbial makeup of healthy IVDs differed from that of degenerated and herniated IVDs. Changes in the composition of the microbiome and in the way hosts respond to microbiota, which can cause abnormal bone growth and resorption [17, 18], gave rise to the idea of the gut-bone marrow axis [19, 20] and the gut-bone axis [18]. Following the study by Rajasekaran et al., a comparable gut-disc axis concept has emerged that could have significant implications for IVDD and LBP [16, 21]. As a result, the regulation of gut microbiota could potentially impact the diversity and quantity of microbiota within the intervertebral disc, ultimately helping to regulate intervertebral disc degeneration.

However, additional investigation is required to further explore the distinct role of various gut microbiota taxa in the development of intervertebral disc degeneration. Akin to randomized controlled trials (RCT), the Mendelian randomization (MR) study is a recent research approach that investigates the causal relationship between exposure and outcome [22]. Mendelian randomization is a genetic epidemiology technique that uses single nucleotide polymorphisms (SNPs) that are known to affect modifiable exposures as instrumental variables (IVs) to deduce the causal effect of an exposure on an outcome. This approach is advantageous because it can eliminate confounding bias and can help to distinguish between the causal pathways of phenotypically grouped risk variables that are difficult to randomize or that are prone to measurement error [23]. In this study, we utilized GWAS summary statistics of GM and IVDD to perform MR analysis, with the aim of identifying GM taxa that may have a significant impact. This approach can help to confirm existing evidence and offer fresh perspectives on the management and prevention of intervertebral disc degeneration.

Materials and methods

Study design

The overall flowchart of this study is shown in Fig. 1. MR studies require three assumptions to be met: (1) a strong correlation between the instrumental variable (IV) and the exposure, (2) IVs are unrelated to confounding factors, and (3) IVs are only related to the outcome through the exposure [24, 25]. Specifically, we determined the gut microbiota taxa that had a causal effect on intervertebral disc degeneration (IVDD) through bidirectional two-sample Mendelian randomization. Our results were reported according to the STROBE-MR guidelines [26]. We utilized GWAS data that had previously been obtained with informed consent and ethical approval for public release.

Fig. 1 Overall flow chart of this study

Data sources for exposure and outcome

Twin, family, and population-based studies indicate that genetic factors also play a role in determining the composition of the gut microbiota, and some bacterial taxa exhibit heritability [27, 28]. The MiBioGen consortium analyzed host genotypes and fecal microbiome 16S rRNA gene sequencing profiles of 18,340 participants, as reported by Kurilshikov et al. [11]. The participants were drawn from 24 cohorts spanning the United States, Canada, Israel, South Korea, Germany, Denmark, the Netherlands, Belgium, Sweden, Finland, and the United Kingdom. Microbiome trait loci (mbTL) were identified using standardized methods to pinpoint genetic loci that influence the relative abundance (mbQTLs) or presence (mbBTLs) of specific microbial taxa. Finally, Gene Set Enrichment Analysis (GSEA) and phenome-wide association studies (PheWAS) were conducted to provide biological interpretations of the results from the genome-wide association study (GWAS). The GWAS examined 211 GM taxa ranging from genus to phylum level and discovered genetic variants associated with 9 phyla, 16 classes, 20 orders, 35 families, and 131 genera.

We obtained GWAS summary statistics for IVDD from the FinnGen Consortium R8 release, which included 29,508 cases and 227,388 controls [29]. The FinnGen project was initiated in the autumn of 2017 in Finland, aiming to integrate genomic information with digital healthcare data and involving collaboration from universities, hospitals, THL (National Institute for Health and Welfare), blood service centers, biobanks, FINBB (Finnish Biobanks), international pharmaceutical companies, and hundreds of thousands of Finnish participants. The primary goal of this project is to collect and analyze genomic and health data from 500,000 participants in the Finnish Biobank to enhance human health and discover novel approaches for treating various diseases. The data collected include individual clinical medical information, genomic data, and environmental/lifestyle factors. To establish a comprehensive research resource, FinnGen employs various data collection methods, including medical information from Finland’s national healthcare archives and genomic data obtained through large-scale gene sequencing and genotyping techniques. Additionally, the project may also gather disease-related survey questionnaires and biological specimens to obtain more comprehensive data. The diagnosis of IVDD was based on ICD-10 M51, ICD-9 722, and ICD-8 275, excluding ICD-9 7220|7224|7227|7228A and ICD-8 7250. Table 1 presents detailed information on the exposure and outcome analyzed in this MR study.

Table 1 Details of the exposure and outcome

Identification of IVs

SNPs closely associated with each GM taxon were used as instrumental variables (IVs) in this MR study. Due to the limited number of IVs obtained at a strict threshold (P < 5 × 10⁻⁸), a more comprehensive threshold (P < 1 × 10⁻⁵) was utilized to obtain a relatively higher number of IVs, thus resulting in more robust results [30]. In addition, to ensure the independence of each IV, SNPs within a 10,000 kb window size with a threshold of r² < 0.001 were pruned to mitigate linkage disequilibrium (LD). Subsequently, we eliminated palindromic SNPs and SNPs that did not appear in the outcome from the IVs. Ultimately, we computed the F statistic for the IVs to evaluate the degree of bias due to weak instruments. The calculation formula is as follows: \(F = \frac{N - K - 1}{K} \times \frac{R^{2}}{1 - R^{2}}\), where \(R^{2} = 2 \times \mathrm{EAF} \times (1 - \mathrm{EAF}) \times \mathrm{BETA}^{2}\) [31]. \(R^{2}\) represents the proportion of variance in the exposure that is explained by genetic variants. N = sample size; K = the number of IVs. If the F statistic > 10, weak IVs were deemed not to have caused bias [32].
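As a minimal sketch of the instrument-strength check described above, the following Python snippet evaluates R² and the F statistic for a few SNPs. The function names and the (EAF, BETA) values are hypothetical and chosen only for illustration; the sample size is that of the MiBioGen GWAS, and BETA is assumed to be expressed in standard-deviation units of the exposure. It is not the analysis code used in this study.

```python
def snp_r2(eaf, beta):
    """Variance in the exposure explained by one SNP: 2 * EAF * (1 - EAF) * BETA^2."""
    return 2.0 * eaf * (1.0 - eaf) * beta ** 2


def f_statistic(r2, n, k=1):
    """F = (N - K - 1) / K * R^2 / (1 - R^2), with N the sample size and K the number of IVs."""
    if not 0.0 <= r2 < 1.0:
        raise ValueError("R^2 must lie in [0, 1)")
    return (n - k - 1) / k * r2 / (1.0 - r2)


# Hypothetical instruments as (EAF, BETA) pairs; these are NOT values from the study.
instruments = [(0.30, 0.05), (0.12, -0.08), (0.45, 0.04)]
n_exposure = 18340  # MiBioGen sample size reported in the text

per_snp_f = [f_statistic(snp_r2(eaf, beta), n_exposure) for eaf, beta in instruments]
# An instrument is treated as sufficiently strong when F > 10.
is_strong = [f > 10 for f in per_snp_f]
```

Here the F statistic is evaluated per SNP (K = 1); passing the summed R² together with K equal to the number of instruments would give the joint statistic instead.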

Statistical methods

For each GM taxon, the inverse variance weighted (IVW) method was used as the primary analysis method to determine causal associations (P < 0.05), with four additional methods (MR-Egger, weighted median, simple mode, and weighted mode) employed as supplementary measures [33, 34]. This study conducted sensitivity analyses in order to eliminate potential bias and examine the robustness of the IVW results. The Cochran Q test was utilized to assess heterogeneity among SNPs, and a P value greater than 0.05 indicates a lower likelihood of heterogeneity among the SNPs, in which case the IVW fixed-effect model was employed for analysis. Conversely, if the P value was less than or equal to 0.05, the IVW random-effects model was used [35].
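The model-selection rule above can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the code used in this study: the function names are hypothetical, the per-SNP ratio estimates use first-order weights (β_exposure / se_outcome)², the random-effects model is taken to be the multiplicative one, and the chi-square tail probability is computed with a standard recurrence valid for integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed forms for df = 1 (odd case) and df = 2 (even case), then step up by 2.
    if df % 2:
        k, sf = 1, math.erfc(math.sqrt(half))
    else:
        k, sf = 2, math.exp(-half)
    while k < df:
        sf += math.exp((k / 2.0) * math.log(half) - half - math.lgamma(k / 2.0 + 1.0))
        k += 2
    return min(sf, 1.0)


def ivw(beta_exp, beta_out, se_out, alpha=0.05):
    """IVW estimate with Cochran's Q deciding between fixed and random effects."""
    if len(beta_exp) < 2:
        raise ValueError("at least two SNPs are needed for Cochran's Q")
    ratios = [bo / be for be, bo in zip(beta_exp, beta_out)]      # Wald ratios
    weights = [(be / so) ** 2 for be, so in zip(beta_exp, se_out)]  # inverse variances
    w_sum = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / w_sum
    se_fixed = math.sqrt(1.0 / w_sum)
    q = sum(w * (r - estimate) ** 2 for w, r in zip(weights, ratios))
    df = len(ratios) - 1
    q_p = chi2_sf(q, df)
    if q_p > alpha:
        model, se = "fixed", se_fixed
    else:
        # Multiplicative random effects: inflate the SE by sqrt(Q / df).
        model, se = "random", se_fixed * math.sqrt(max(1.0, q / df))
    return {"estimate": estimate, "se": se, "se_fixed": se_fixed,
            "Q": q, "Q_p": q_p, "model": model}


# Hypothetical summary statistics for four SNPs (not data from this study).
beta_exposure = [0.10, 0.20, 0.15, 0.12]
homogeneous = ivw(beta_exposure, [0.02, 0.04, 0.03, 0.024], [0.01] * 4)
heterogeneous = ivw(beta_exposure, [0.02, -0.04, 0.06, 0.0], [0.005] * 4)
```

In the first hypothetical data set every SNP implies the same ratio, so Q is essentially zero and the fixed-effect model is retained; in the second the ratios disagree strongly, so the random-effects standard error is used.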

In IVW regression, the intercept term is not considered and the reciprocal of the outcome variance (se²) is used as weights for fitting [33]. The weighted median method is defined as the median of the weighted empirical density function of the ratio estimates, and causal relationships can be consistently estimated if at least 50% of the information in the analysis comes from valid instruments [33]. In MR-Egger regression, the intercept term is considered, and the reciprocal of the outcome variance (se²) is also used as weights for fitting, with the resulting intercept used to assess horizontal pleiotropy [36]. The MR-PRESSO global test was also used to assess horizontal pleiotropy, and outliers were removed to reduce its influence [37]. In addition, funnel plots and forest plots were constructed to visualize and ensure the reliability of the results. Finally, we converted the effect estimates to odds ratios (ORs) and their corresponding 95% confidence intervals (CIs) to more intuitively display the causal associations between each GM taxon and the outcome.
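To make the MR-Egger description and the conversion to odds ratios concrete, the sketch below fits a weighted regression of the SNP-outcome effects on the SNP-exposure effects with an intercept, using 1/se² of the outcome as weights, and then exponentiates an effect estimate. The function names and input values are hypothetical; flooring the residual variance at 1 and using a normal-approximation 95% interval (z = 1.96) are assumptions of this illustration rather than details confirmed by the text.

```python
import math


def mr_egger(beta_exp, beta_out, se_out):
    """Weighted regression of outcome effects on exposure effects, with intercept."""
    if len(beta_exp) < 3:
        raise ValueError("MR-Egger needs at least three SNPs")
    # Orient every SNP so that its exposure effect is positive.
    x, y = [], []
    for be, bo in zip(beta_exp, beta_out):
        sign = -1.0 if be < 0 else 1.0
        x.append(sign * be)
        y.append(sign * bo)
    w = [1.0 / so ** 2 for so in se_out]  # reciprocal of the outcome variance
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx                  # causal effect estimate
    intercept = ybar - slope * xbar    # average pleiotropic effect
    rss = sum(wi * (yi - intercept - slope * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    scale = max(1.0, rss / (len(x) - 2))  # assumption: residual variance floored at 1
    return {
        "slope": slope,
        "se_slope": math.sqrt(scale / sxx),
        "intercept": intercept,
        "se_intercept": math.sqrt(scale * (1.0 / sw + xbar ** 2 / sxx)),
    }


def to_odds_ratio(beta, se, z=1.96):
    """Convert a log-odds effect and its SE to (OR, lower 95% CI, upper 95% CI)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical data lying exactly on the line beta_out = 0.01 + 0.2 * beta_exp.
beta_exposure = [0.10, 0.20, 0.15, 0.12, 0.25]
beta_outcome = [0.01 + 0.2 * b for b in beta_exposure]
fit = mr_egger(beta_exposure, beta_outcome, [0.01] * 5)
odds_ratio, ci_low, ci_high = to_odds_ratio(fit["slope"], fit["se_slope"])
```

An intercept that differs from zero by more than its standard error allows is read as evidence of directional horizontal pleiotropy; in the hypothetical data above the intercept is 0.01 by construction.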

A significance level of P < 0.05 indicated the presence of a causal relationship between the exposure and outcome. To account for multiple testing (multiple exposures), the significance of the MR effect estimates was controlled using a Benjamini–Hochberg false discovery rate (FDR) of < 5% at a specific level. Additionally, we performed a reverse MR analysis to examine reverse causality. To meet the core assumptions of MR, the selected SNPs were further filtered in the Phenoscanner database to ensure that the included instrumental variables were not correlated with known confounding factors [38], including dried fruit intake [57]. In the mouse model of IVDD, the abundance of Muribaculaceae and Lactobacillus increased, while the abundance of Clostridia_UCG-014 decreased. Furthermore, fecal microbiota transplantation further increased the abundance of Lactobacillus and reduced the abundance of Clostridia_UCG-014 [58]. Su et al. [59] conducted a Mendelian randomization study on the potential causal effects of specific gut microbiota and gut microbiota metabolites on low back pain (LBP). Similar to our findings, they observed that the abundance of the genus Marvinbryantia is a potential risk factor for LBP, while the abundance of the family Rikenellaceae and family Ruminococcaceae is a potential protective factor for LBP. These studies provide evidence for the association between gut microbiota and intervertebral disc degeneration (IVDD).

The pathogenic mechanisms underlying the role of gut microbiota in IVDD have been widely discussed in previous studies, involving various aspects such as inflammatory response, gut barrier function, and nutrient metabolism. These mechanisms intersect and collectively contribute to the overall impact of gut microbiota in IVDD. Escherichia-Shigella is a group of bacteria capable of producing lipopolysaccharide (LPS). LPS is a glycolipid component found in the outer membrane of gram-negative bacteria [60]. LPS can activate the TLR4/MyD88/NF-κB signaling pathway, leading to the release of pro-inflammatory mediators such as IL-6, IL-1β, and TNF-α. This cascade of events triggers a series of inflammatory processes, ultimately resulting in chronic low-grade inflammation [61]. A significant decrease in the abundance of Marvinbryantia spp., a producer of short-chain fatty acids, was observed in older adults with low muscle mass [62]. Sarcopenia is postulated to be an influential factor in chronic low back pain [87]. Further investigations are warranted to explore the potential involvement of these taxa in IVDD. Finally, the association between human microbiota and the host in both healthy and disease states is a complex interplay rather than a simple one-way “causal relationship” [44]. Therefore, future studies should consider the intricate coordination and crosstalk between the host and gut microbiota to gain a better understanding of the relationship between gut microbiota and disease.

Conclusion

Using publicly available GWAS data, we conducted a bidirectional two-sample Mendelian randomization analysis of the causal association between 211 gut microbiota taxa and intervertebral disc degeneration (IVDD). Our analysis identified eight nominal causal associations and one strong correlation, providing further theoretical support for the concept of the gut-disc axis. This study was based on a GWAS meta-analysis dataset generated from 16S rRNA sequencing, which highlights the need for analyses based on more advanced large-scale studies using metagenomic sequencing. Nevertheless, our research also offers valuable biomarkers for understanding the progression, diagnosis, and potential therapeutic approaches for IVDD.