1 Introduction

Complex diseases such as cancer arise from the interplay of numerous genetic and environmental factors, most of which exert a relatively modest effect. Genetic variations, specifically single nucleotide polymorphisms (SNPs), are expected to play a significant role in determining susceptibility to such diseases. Lung cancer is one of the most prevalent and deadly malignancies worldwide [1]. Understanding the genetic factors associated with lung cancer susceptibility can provide valuable insights into its pathogenesis and potentially lead to improved prevention and treatment strategies.

CD147, also known as basigin or extracellular matrix metalloproteinase inducer (EMMPRIN), is a type I transmembrane glycoprotein initially identified as a regulator of matrix metalloproteinases (MMPs), and inhibition of CD147 may represent a promising therapeutic strategy [2]. Previous studies have shown that CD147 interacts with caveolin-1 [3], MCT1, MCT4 [4], and beta1-integrins [5]. As a multifunctional molecule implicated in the progression and metastasis of cancer, CD147 has significant therapeutic potential in diverse diseases, including lung cancer, inflammation, and even COVID-19 [6]. Inhibition of CD147 via Thai traditional medicines [7] and targeted agents such as Formosanin C has shown promise in reducing CD147 expression levels and impeding the progression of non-small-cell lung cancer by disrupting MCT4/CD147-mediated lactate export [8]. CD147 is significantly upregulated in multiple cancers, and some studies have shown that CD147 expression is associated with worse overall survival (OS) [36].

One notable strength of our study is that the allele frequencies we observed are similar to those reported for all populations (African, American, East Asian, European and South Asian) in the Ensembl database (release 111). This similarity may allow our results to be generalized to a wider range of populations.

We must acknowledge several limitations of our study. Firstly, the relatively modest sample size may have limited the statistical power to detect small effect sizes. Secondly, despite adjustment for known confounding factors such as age and smoking status, residual confounding or unmeasured variables may remain. Further comprehensive research is therefore planned, including meta-analyses and replication studies involving diverse ethnic groups. These efforts will help assess the consistency of our findings across populations.

The study findings offer valuable insights for improving lung cancer risk assessment, particularly in the Chinese population. Although no significant associations were found between the three CD147 TagSNPs and lung cancer susceptibility in the general population, stratification analyses revealed important subgroup-specific risks. In females, the rs79361899 AA and AG genotypes were associated with increased lung cancer risk, and interaction analysis demonstrated a significant gene × gender interaction under the rs79361899 recessive model, suggesting the need for gender-specific genetic screening. Similarly, the rs28992491 GG and rs79361899 AA genotypes were linked to higher lung cancer susceptibility in individuals aged 65 and older, indicating the importance of considering age-specific genetic factors in lung cancer risk assessment.

Future studies should involve diverse ethnic groups to determine the wider applicability of our findings across different populations. Additionally, exploring the biological mechanisms by which CD147 polymorphisms influence lung cancer susceptibility will provide deeper insights into the role of CD147 in cancer progression.

In conclusion, our study investigated the association between CD147 gene polymorphisms and the risk of lung cancer in a Han Chinese population and uncovered potential gender- and age-specific effects of particular genotypes on lung cancer susceptibility. These findings highlight the importance of considering genetic variations and their interactions with demographic factors when investigating the complex etiology of lung cancer.