Introduction

Stroke is one of the most prevalent neurological disorders and a major cause of disability and death among middle-aged and elderly individuals, making it a significant global public health concern [1]. According to the Global Burden of Disease estimates for 2019, there were 12.2 million incident cases of stroke, 101 million prevalent cases, 143 million disability-adjusted life-years lost, and 6.55 million deaths caused by stroke [2]. Stroke has various subtypes, of which ischemic stroke is the most common. Ischemic stroke can be further divided into three subtypes: large artery stroke (LAS), cardioembolic stroke (CES), and small vessel stroke (SVS) [3]. Stroke also includes intracerebral hemorrhage (ICH) and subarachnoid hemorrhage (SAH) [4]. Transient ischemic attack (TIA) is a robust predictor of stroke and is considered a minor stroke [5]. White matter hyperintensities (WMH) and brain microbleeds (BMB) are important risk factors for ischemic stroke [6] and ICH [7]. While the pathological processes vary among stroke subtypes, they all involve the death of nerve cells [8]. Despite several studies on the nature of stroke, the biological mechanisms and risk factors underlying its occurrence remain unclear. Identifying modifiable risk factors for stroke is crucial for developing preventative interventions.

Recently, the connection between metabolomics and stroke has gained attention. Metabolomics is used for biomarker discovery, providing insights into disease occurrence and progression by uncovering altered metabolic pathways and intermediate metabolites [9]. Metabolites are the end products or intermediate compounds of metabolism and serve essential functions in the human body. Multiple studies have demonstrated that metabolites are functional intermediates that can elucidate the potential biological mechanisms underlying disease genetics [26]. Among the 486 metabolites, 309 are known, while 177 are unknown. According to the Kyoto Encyclopedia of Genes and Genomes (KEGG) database, the known metabolites can be assigned to eight broad metabolic categories: cofactors and vitamins, energy, lipids, nucleotides, peptides, amino acids, carbohydrates, and xenobiotics. We excluded 34 metabolite traits for which no instrumental variables (IVs) could be assigned, leaving a subset of 452 serum metabolites for further analysis.

Instrumental variables selection

We selected SNPs with p-values below the locus-wide significance level (1 × 10⁻⁵) in the initial analysis as IVs, a relaxed threshold chosen to obtain more comprehensive results and to increase the number of available IVs. Subsequently, all IVs underwent linkage disequilibrium (LD) clumping (r² = 0.01; distance = 5000 kb) to mitigate the influence of correlated SNPs. Furthermore, we queried PhenoScanner (http://www.phenoscanner.medschl.cam.ac.uk/) to identify potential pleiotropic effects. Additionally, we calculated the F-statistic, F = R²(N − 2)/(1 − R²), which assesses the strength of each instrument, where R² represents the proportion of variance explained by the genetic instrument and N is the effective sample size of the GWAS [27]. SNPs with an F-statistic greater than 10 were retained for the subsequent MR analysis, since this threshold is commonly used to limit weak-instrument bias [28]. Finally, we excluded palindromic SNPs [29], for which the effect allele is ambiguous.
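The instrument-strength filter described above can be illustrated with a minimal Python sketch. It implements only the stated formula F = R²(N − 2)/(1 − R²) and the F > 10 cut-off; the function names, the SNP identifiers, the R² values, and the sample size used in the example are hypothetical placeholders and are not taken from the study data.

```python
from typing import Dict, List


def f_statistic(r2: float, n: int) -> float:
    """Per-instrument F-statistic, F = R^2 (N - 2) / (1 - R^2)."""
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must lie in [0, 1)")
    if n <= 2:
        raise ValueError("n must be greater than 2")
    return r2 * (n - 2) / (1.0 - r2)


def keep_strong_instruments(
    r2_by_snp: Dict[str, float], n: int, threshold: float = 10.0
) -> List[str]:
    """Return the SNP identifiers whose F-statistic exceeds the threshold."""
    return [snp for snp, r2 in r2_by_snp.items() if f_statistic(r2, n) > threshold]


if __name__ == "__main__":
    # Hypothetical values for illustration only (not from the study).
    example_r2 = {"rs_example_1": 0.002, "rs_example_2": 0.0005}
    example_n = 10_002
    for snp, r2 in example_r2.items():
        print(snp, round(f_statistic(r2, example_n), 2))
    print(keep_strong_instruments(example_r2, example_n))
```

With these placeholder inputs, the first SNP has an F-statistic of about 20 and is retained, whereas the second has an F-statistic of about 5 and is dropped as a weak instrument.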

Data sources for stroke and its subtypes

Stroke is classified based on the clinical criteria defined by the World Health Organization (WHO) and the tenth edition of the International Classification of Diseases (ICD-10) [30]. Data for certain stroke subtypes were sourced from publicly available summary data provided by the MEGASTROKE consortium [14]. The MEGASTROKE consortium encompassed 446,696 individuals of European ancestry (40,585 any stroke (AS) cases and 406,111 controls). Within the any ischemic stroke (AIS) category, there were 34,217 cases of overall AIS, 4,373 cases of LAS, 7,193 cases of CES, and 5,386 cases of SVS. Although the MEGASTROKE study included results for SVS, the cases were defined based on the Trial of Org 10172 in Acute Stroke Treatment (TOAST) criteria [3] and did not specifically rely on MRI findings. Therefore, we also examined small vessel infarction using a sample of recent lacunar stroke (LS) cases, comprising 6,030 cases and 248,929 controls [31]. WMH and BMBs are imaging markers of cerebral microstructural damage [32]. WMH appear as areas of increased brightness on T2-weighted brain images [33]. BMBs are small, low-signal lesions identified on susceptibility-weighted imaging sequences or T2-weighted gradient-recalled echo sequences [34]. The summary data for WMH (N = 32,114) were derived from an expanded set of a recent GWAS of brain imaging phenotypes conducted in the UK Biobank [35]. For BMBs, we could not find a GWAS specifically focused on individuals of European ancestry. However, in a recent multi-ethnic GWAS on BMBs [36], we identified 2,889 cases of microbleeds among the 23,032 individuals remaining after excluding patients with dementia and stroke. Summary-level data for the remaining stroke subtypes were obtained from the latest FinnGen R9 Biobank [37], which included 3,749 cases of ICH, 3,289 cases of SAH, and 18,398 cases of TIA. Further information on the GWAS can be accessed at https://www.finngen.fi/en.
The studies in these consortia obtained approval from local research ethics committees and institutional review boards, and all participants provided written informed consent. Table 1 shows the characteristics of summarized datasets for the stroke subtypes.

Table 1 Characteristics of the summary datasets for stroke

MR analysis

A two-sample MR analysis was used to evaluate the causal association between serum metabolites and stroke and its subtypes. The fixed-effects or random-effects inverse-variance weighted (IVW) method was employed as the primary MR analysis. The choice between the fixed-effects and random-effects IVW methods depends on heterogeneity and pleiotropy. When neither heterogeneity nor pleiotropy is present, the fixed-effects IVW estimates are given priority. In cases of heterogeneity without pleiotropy, we favor the random-effects IVW model, which can provide unbiased estimates when horizontal pleiotropy is balanced [38]. To enhance the robustness of the results, we employed the MR-Egger method, weighted median analysis, and the MR pleiotropy residual sum and outlier (MR-PRESSO) test as sensitivity analyses. The MR-Egger method accounts for directional horizontal pleiotropy: an intercept term that deviates significantly from zero indicates the presence of invalid instruments and suggests potential bias in the IVW estimate [39]. The I² value and Cochran's Q test were used to assess potential heterogeneity and identify outliers in the IVW and MR-Egger analyses. The weighted median analysis requires at least half of the weight to come from valid instruments, and the overall MR estimate is taken as the weighted median of the causal estimates from the individual SNPs [40]. The MR-PRESSO test was also conducted to identify potential horizontal pleiotropy and correct for its impact by removing outliers [41]. Additionally, we performed leave-one-out analyses to evaluate whether the observed associations were driven by individual SNPs.
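To make the primary analysis concrete, the following Python sketch shows one common way to compute an IVW estimate from per-SNP summary statistics, together with Cochran's Q and I². It is an illustration under stated assumptions rather than the pipeline used in this study: the per-SNP ratio standard errors use the first-order approximation se_out/|beta_exp|, the random-effects standard error uses a multiplicative model with the overdispersion factor floored at 1, and all names and input values are hypothetical. MR-Egger, the weighted median, and MR-PRESSO are not covered here.

```python
import math
from typing import NamedTuple, Sequence


class IVWResult(NamedTuple):
    estimate: float   # IVW causal estimate (same for fixed and random effects)
    se_fixed: float   # fixed-effects standard error
    se_random: float  # multiplicative random-effects standard error
    q: float          # Cochran's Q statistic
    i2: float         # I^2, proportion of variation attributed to heterogeneity


def ivw(
    beta_exp: Sequence[float],
    beta_out: Sequence[float],
    se_out: Sequence[float],
) -> IVWResult:
    """Inverse-variance weighted MR estimate from SNP-level summary statistics."""
    k = len(beta_exp)
    if k < 2 or not (k == len(beta_out) == len(se_out)):
        raise ValueError("need at least two SNPs and equal-length inputs")

    # Wald ratio per SNP and its inverse-variance weight (first-order SE).
    ratios = [bo / be for be, bo in zip(beta_exp, beta_out)]
    weights = [(be / so) ** 2 for be, so in zip(beta_exp, se_out)]

    total_w = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total_w
    se_fixed = math.sqrt(1.0 / total_w)

    # Cochran's Q and I^2 quantify heterogeneity between the SNP-specific ratios.
    q = sum(w * (r - estimate) ** 2 for w, r in zip(weights, ratios))
    df = k - 1
    i2 = (q - df) / q if q > df else 0.0

    # Assumption: multiplicative random effects, never shrinking the fixed SE.
    se_random = se_fixed * math.sqrt(max(1.0, q / df))
    return IVWResult(estimate, se_fixed, se_random, q, i2)


if __name__ == "__main__":
    # Hypothetical two-SNP example with heterogeneous ratios (0.1 and 0.3).
    print(ivw([0.1, 0.1], [0.01, 0.03], [0.01, 0.01]))
```

In the two-SNP example the ratios disagree, so Q is approximately 2 with one degree of freedom, I² is approximately 0.5, and the random-effects standard error is wider than the fixed-effects one. This mirrors the rule in the text of preferring the random-effects model when heterogeneity is present.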

A p-value below 0.05 was considered a nominal association. False discovery rate (FDR) correction was employed to control for false positives in multiple testing [42]. Associations were considered statistically significant if the estimated causal effect of a given metabolite had an FDR-adjusted value < 0.05. The statistical power (> 80%) was estimated using the mRnd power calculator (http://cnsgenomics.com/shiny/mRnd/) [11, 46].

Blood is the most widely used sample source for metabolomics because it contains numerous detectable metabolites and can be easily obtained in large sample sizes, facilitating the screening of circulating biomarkers for stroke risk [81]. Considering that cholesterol is also present in various stroke outcomes, we can infer that cholesterol may influence the occurrence and progression of stroke by affecting bile acid metabolism. Additionally, the metabolites involved in steroid degradation include cholesterol, and this pathway is associated with multiple stroke outcomes. Therefore, we hypothesize that cholesterol may affect steroid degradation, thereby influencing stroke occurrence. Several pathways involving the metabolites glutamate and aspartate are also present in multiple stroke outcomes: glutathione metabolism; pantothenate and CoA biosynthesis; arginine biosynthesis; aminoacyl-tRNA biosynthesis; and alanine, aspartate, and glutamate metabolism. Aspartate remains stable in our results, allowing us to infer its influence on these pathways and its impact on stroke progression. However, glutamate is non-robust in our results, indicating the need for further research to validate it.

Additionally, there are differences between the metabolites primarily involved in pathway analysis and those in our significant results. Therefore, there may be other unexplored metabolic pathways. Further research is required to better understand the relationships between metabolites, metabolic pathways, and stroke outcomes. Moreover, we have also discovered the involvement of certain novel pathways in stroke pathogenesis, including “styrene degradation” playing a significant role in ICH and “clavulanic acid biosynthesis” in SAH. The specific mechanisms behind these findings also require further investigation.

The use of drugs can alter the metabolite profile. For example, statin medications lead to extensive lipid alterations and effectively reduce cholesterol levels [82]. Recently, the development of neuroprotective peptide drugs has influenced the occurrence of stroke by affecting the metabolism of amino acids in the human body, including glutamate and aspartate [83]. Additionally, some cardiovascular drugs can influence P-glycoprotein, which regulates the absorption and excretion of xenobiotics. P-glycoprotein is associated with ischemic stroke in mouse models [84]. Therefore, the use of drugs in stroke patients may interfere with the measurement of metabolites, posing a challenge for future research on the impact of any particular metabolite on stroke.

Our findings, cross-validated against RCT results, provide early predictive factors for future research on the utility of these biomarkers in blood tests for stroke prevention. This study suggests that targeting certain metabolites may be a promising avenue for future drug development in the treatment of stroke.

Our study has several strengths. First, it offers extensive coverage of genetic variants, allowing a comprehensive analysis of the genetically determined relationship between blood metabolites and the eleven stroke phenotypes. In addition, the genome-wide datasets for the stroke subtypes were drawn primarily from populations of European ancestry, which mitigates potential biases arising from population differences. Second, using bidirectional MR designs largely avoided reverse causation and residual confounding. Third, applying the largest available datasets on various stroke subtypes in the field, along with extensive sensitivity analyses, ensured the robustness of our findings.

Nevertheless, this study has certain limitations that should be acknowledged. First, we leveraged exposure-specific GWAS data and outcomes from publicly available summary data, with potential sample overlaps that might introduce bias. Additionally, the distinct data sources in this study may correspond to different population groups. These samples could differ substantially in population characteristics, including age, gender, and socioeconomic background. Such differences can influence the interpretation of causal estimates and the validity of causal inferences. Second, owing to the relatively limited number of participants in the exposure dataset and the restricted range of metabolite types, some associations between metabolites and stroke might have been missed. Third, the study participants were primarily of European descent, so the generalizability of our findings to other populations needs to be assessed. Fourth, to enhance the reliability of our findings, we applied multiple-testing correction. However, this approach might overlook potential metabolites causally related to stroke. Fifth, the functions and mechanisms in disease of some metabolites and metabolic pathways covered in this study have not been fully elucidated, which limits our interpretation of the MR results. Lastly, due to the limited variance explained by SNPs or sample size constraints in the GWAS results, some of our MR analyses might lack sufficient power to detect small effects. Future investigations using larger GWAS datasets should provide greater statistical power and more precise assessments of the genetic influences on metabolites. Although MR has helped identify blood metabolites associated with stroke, prospective studies are still needed to investigate their potential mechanisms.

Conclusion

This two-sample MR study revealed the significant role of serum metabolites in the risk of 11 stroke subtypes. The identification of 28 notable causal associations between 25 metabolites and 9 stroke phenotypes, 40 significant metabolic pathways in 11 stroke phenotypes, and nominal causal associations of other metabolites contributes to our understanding of the intricate interplay between metabolites and the brain in the development of stroke. Moreover, these metabolites offer valuable potential as circulating metabolic biomarkers, holding promise for application in stroke screening and preventive strategies within clinical settings. These findings contribute to the understanding of the biological mechanisms underlying stroke and pave the way for future exploration of targeted therapeutic interventions.