Background

Breast cancer is one of the most common malignancies in women worldwide [1] and presents with different molecular subtypes, including luminal A, luminal B, HER2-enriched, and basal-like (also called triple negative) [2]. As a major type of epigenetic modification, DNA methylation is involved in regulating cellular processes, including chromosomal instability [3] and gene expression. The hypermethylation of CpG regions in specific genes contributes to neoplastic formation through the transcriptional silencing of tumor suppressor genes. Aberrant methylation patterns of specific genes can help identify differences between breast cancer subtypes [2] and show promise for use in large-scale epidemiological studies. It has been suggested that leukocyte DNA methylation, as a simple non-invasive blood marker [4, 5], could serve as a surrogate for systemic methylation activity and offers great potential for predicting an increased risk of breast cancer [6].

The Wilms' tumor gene (WT1) is a tumor suppressor gene that is involved in human cell growth and differentiation. WT1 has been reported to be differentially methylated in the tissues of hepatocellular carcinoma [7], lung cancer [8] and breast cancer [9]. Aberrant WT1 methylation may lead to a reduction or absence of WT1 expression, which results in the overexpression of the insulin-like growth factor I receptor (IGF 1R) and insulin-like growth factor II (IGF II), thereby promoting breast cancer progression [10,11,12]. CA10 is a member of the carbonic anhydrase family, a large family of zinc-containing metalloenzymes that catalyze the reversible hydration of carbon dioxide and the dehydration of carbonic acid [13]. Ivanov et al. suggested that the induction or enhancement of carbonic anhydrase expression may contribute to the tumor microenvironment by maintaining an acidic extracellular pH and helping the growth and metastasis of cancer cells [14]. Studies have demonstrated that abnormal expression of carbonic anhydrase family members caused by aberrant methylation is related to gastric cancer and the metastasis of ovarian tumors [13, 15]. Furthermore, Wojdacz et al. reported that the methylation of both WT1 and CA10 differed significantly between breast cancer tumor tissues and non-malignant tissues [16]. However, how the methylation of these two genes in leukocyte DNA affects breast cancer susceptibility remains unclear.

In this study, we investigated the associations between the methylation of WT1 and CA10 in peripheral blood leukocyte DNA and breast cancer risk. We subsequently used a nested case-control dataset from the EPIC-Italy cohort study as external data to validate the association between gene methylation and breast cancer risk. We also investigated the associations between the methylation of these two genes and the risk of different molecular types of breast cancer.

Methods

Study subjects

We investigated the relationship between WT1 and CA10 methylation and breast cancer risk using a case-control study. All included breast cancer patients were newly diagnosed females recruited from the Tumor Hospital of Harbin Medical University from 2010 to 2014. Female breast cancer subjects were included if they were diagnosed with invasive ductal carcinoma (IDC) or ductal carcinoma in situ (DCIS); other types of breast cancer (such as lipoma of the breast, metastatic breast cancer, etc.) were excluded from our study. Controls were recruited from patients admitted to the Orthopedic and Ophthalmology Department of the Second Affiliated Hospital of Harbin Medical University and from volunteers from the Xiangfang community of Harbin within the same period. All controls were also female. In addition, all control participants were asked about their disease history in a questionnaire, and individuals who reported a history of any cancer were excluded. Finally, 402 female breast cancer cases and 557 female controls were included in our study. A blood sample (5 mL) was collected from each participant and then stored at −80 °C.

Data collection

All subjects were interviewed face-to-face by trained investigators using standardized questioning methods. The questionnaire was adopted from the study by Shu et al. [17] and covered demographic information (age, ethnicity, and others); daily dietary intake (vegetables, fruits, beverages, and snacks); behaviors (smoking, drinking, physical activity and work activity); female-specific questions involving menstruation status and breast disease history (lobular hyperplasia, cyst, and others); gynecologic surgery history (hysterectomy, ovariotomy); and family history of cancer and breast cancer. The dietary and behavioral questions concerned the participants' daily routine during the year prior to the interview. The basic demographic characteristics and environmental factors of the study subjects are presented in Table S1.

The study was validated with the GEO-GSE51032 (EPIC-Italy cohort) dataset, which has a nested case-control design, to analyze the association between the methylation of CA10 and WT1 and breast cancer risk. In that cohort, blood samples were collected and anthropometric measurements were taken. The sample selection criteria and the methods were reported by Riboli et al. [18]. We extracted all 232 female breast cancer cases and all 340 female controls from this nested case-control study and located the methylation probes from the Illumina 450K array. The annotated CpG sites covered by our MS-HRM sequence are illustrated in Fig. 1.

Fig. 1 MS-HRM amplified sequences of WT1 and CA10 and the validated CpG sites in GSE51032

DNA extraction and bisulfite conversion

DNA was extracted from peripheral blood samples using a commercial DNA extraction kit (QIAamp DNA Blood Mini Kit, Hilden, Germany). The concentration and purity of the DNA were assessed using a Nanodrop 2000 Spectrophotometer (Thermo Scientific, USA). Genomic DNA was bisulfite-modified with an EpiTect Bisulfite kit (Qiagen, Hilden, Germany). Bisulfite-converted DNA was normalized to a concentration of 20 ng/mL and stored at −20 °C for subsequent experiments. DNA extraction and sodium bisulfite modification were performed according to the manufacturers' instructions.

Gene methylation status analysis

We performed methylation-sensitive high-resolution melting analysis (MS-HRM) to evaluate the methylation of WT1 and CA10 with the LightCycler 480 system (Roche Applied Science, Mannheim, Germany) equipped with Gene Scanning software (version 2.0). The primers were adopted from a published study [16]. We mixed universal methylated and unmethylated DNA standards (ZYMO, USA) at different ratios to create standards with 0.5, 1, 2, and 5% methylation levels of WT1 and CA10 (Fig. 2). PCR amplification and MS-HRM were optimized and performed. The conditions, reaction mixture and primer sequences used in the MS-HRM experiments are listed in Tables S2 and S3. Each standard reaction was performed in duplicate in each run. Each plate included duplicate water blanks as negative controls. We also repeated some samples in different runs to assess the consistency of the experiment. The observed methylation status of these samples agreed significantly across runs for both WT1 and CA10, with kappa values of 1.00 (P < 0.01) and 0.94 (P < 0.01), respectively (Table S4).
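The run-to-run agreement reported above is summarized by a kappa statistic. As an illustration only, the following minimal sketch computes Cohen's kappa for two runs of binary methylation calls. The function name and the ten example calls are hypothetical and are not study data; the original analysis was performed in SPSS.

```python
from collections import Counter


def cohen_kappa(run1, run2):
    """Cohen's kappa for two equal-length lists of categorical calls."""
    if len(run1) != len(run2) or not run1:
        raise ValueError("runs must be non-empty and of equal length")
    n = len(run1)
    # Observed agreement: fraction of samples with the same call in both runs.
    p_obs = sum(a == b for a, b in zip(run1, run2)) / n
    # Chance agreement, from the marginal call frequencies of each run.
    c1, c2 = Counter(run1), Counter(run2)
    p_exp = sum(c1[k] * c2[k] for k in c1.keys() | c2.keys()) / (n * n)
    if p_exp == 1.0:
        return 1.0  # every sample falls in one category in both runs
    return (p_obs - p_exp) / (1 - p_exp)


# Hypothetical calls (1 = methylated, 0 = unmethylated); NOT study data.
first_run = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
second_run = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
kappa = cohen_kappa(first_run, second_run)
print(round(kappa, 3))
```

In this toy example nine of ten calls agree, but kappa is lower than 0.9 because part of that agreement is expected by chance given how rare the methylated call is.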

Fig. 2 The MS-HRM based method for WT1 and CA10 methylation detection. The panels show normalized melting curves and melting peaks of the methylation standards for WT1 (A, B) and CA10 (C, D). The methylation levels of the standards were 0, 0.5, 1, 2, 5, and 100%

Definitions of different molecular subtypes of breast cancer

Four subtypes of breast cancer were defined as luminal A, luminal B, HER-2 enriched and triple negative breast cancer (TNBC) by immunohistochemical analysis based on previously validated clinicopathological criteria [19].

Statistical analysis

For the distribution of basic demographic characteristics, continuous variables such as age were analyzed by two-sample t-tests, and categorical variables were analyzed by chi-square (χ2) tests. For missing values in the environmental factors, we applied multiple imputation to generate plausible values. To measure the associations between the methylation of WT1 and CA10 and the risk of breast cancer overall and of its different molecular types, we used univariate and multivariate unconditional logistic regression analyses to estimate the crude and adjusted odds ratios (ORs) and 95% confidence intervals (95% CIs). For our case-control study, we used 0% methylation as the cutoff for both WT1 and CA10. We used receiver operating characteristic (ROC) curves to determine the cut-off value of β for the validation dataset. We also applied the propensity score (PS) method to adjust for covariates (involving all environmental factors in the questionnaire); in these models the study outcome served as the dependent variable and the PS served as the confounding variable. Kappa values were calculated to assess the consistency of the same samples across different runs. All tests were two-sided, and P values < 0.05 were considered statistically significant. Data were analyzed using SPSS v.24.0 (SPSS Inc., Chicago, IL, USA).
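For a single binary exposure such as methylated versus unmethylated, the crude OR from univariate logistic regression equals the cross-product ratio of the 2×2 table, and its Wald 95% CI can be computed directly from the four cell counts. The sketch below illustrates this under that assumption. The function name and the counts are hypothetical and are not taken from the study, and the adjusted and PS-adjusted ORs would require a full regression model, which is not shown here.

```python
import math


def crude_odds_ratio(a, b, c, d, z=1.96):
    """Crude OR and Wald 95% CI from a 2x2 table.

    a: methylated cases,    b: unmethylated cases,
    c: methylated controls, d: unmethylated controls.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    log_or = math.log((a * d) / (b * c))
    # Standard error of log(OR) (Woolf's formula).
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(log_or - z * se)
    upper = math.exp(log_or + z * se)
    return math.exp(log_or), lower, upper


# Hypothetical counts for illustration only; NOT study data.
odds_ratio, ci_low, ci_high = crude_odds_ratio(40, 160, 20, 180)
print(f"OR = {odds_ratio:.2f} (95% CI: {ci_low:.2f}-{ci_high:.2f})")
```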

Results

Characteristics of the cases and controls

This study included 402 female cases with a mean age of 51.75 ± 9.39 years and 557 female controls with a mean age of 51.85 ± 10.31 years. Other demographic information of the cases and controls is listed in Table 1. Environmental factors with missing data (≤5.8% missing) were processed by multiple imputation; the variable definitions are presented in Table S1.

Table 1 Demographic characteristics of breast cancer patients and controls

Associations between WT1, CA10 methylation and breast cancer risk

WT1 methylation was associated with breast cancer risk after both multivariable and PS adjustment, with ORs of 2.42 (95% CI: 1.45–4.04, P < 0.01) and 3.07 (95% CI: 1.67–5.64, P < 0.01), respectively. CA10 methylation was significantly associated with breast cancer after multivariable adjustment, with an OR of 1.53 (95% CI: 1.14–2.05, P < 0.01), but was only marginally associated with breast cancer after PS adjustment, with an OR of 1.35 (95% CI: 0.97–1.90, P = 0.08) (Table 2).
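A reported OR and its 95% CI determine an approximate Wald P value, because the CI is symmetric on the log scale. The sketch below assumes a normal approximation and uses the PS-adjusted CA10 estimate quoted above (OR 1.35, 95% CI 0.97–1.90); the function name is hypothetical, and the result is approximate because the published figures are rounded.

```python
import math


def p_from_or_ci(or_value, ci_low, ci_high, z_crit=1.959964):
    """Recover SE, z and two-sided P from an OR and its 95% CI (Wald)."""
    # The CI spans 2 * z_crit standard errors on the log-odds scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(or_value) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


se, z, p = p_from_or_ci(1.35, 0.97, 1.90)
print(f"SE = {se:.3f}, z = {z:.2f}, P = {p:.3f}")
```

Under these assumptions the recovered P value is about 0.08, in line with the reported P = 0.08 and with a CI that just includes 1.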

Table 2 The associations between gene methylation and risk of breast cancer and different molecular types of breast cancer

In the subgroup analyses, after PS adjustment, WT1 methylation was associated with breast cancer risk in both the younger (< 60 years old) and older (≥ 60 years old) groups, with ORs of 2.64 (95% CI: 1.31–5.32, P = 0.01) and 4.72 (95% CI: 1.31–16.97, P = 0.01), respectively. CA10 methylation was associated with breast cancer risk in the younger age group (< 60 years old) before PS adjustment, with an OR of 1.56 (95% CI: 1.15–2.11, P = 0.01); however, the association was not statistically significant after PS adjustment (Table 3). We also analyzed the combined effect and interaction of age and the methylation of WT1 and CA10 on the risk of breast cancer. The P values for the interactions between age and the methylation of WT1 and CA10 were 0.40 and 0.73, respectively. The results are presented in Table 4.

Table 3 Subgroup analysis of the associations between gene methylation and the risk of breast cancer by age group
Table 4 The interaction between age and gene methylation on the risk of breast cancer

Associations between methylation of WT1, CA10 and risk of different molecular types of breast cancer

WT1 methylation was significantly associated with the risk of the luminal A subtype of breast cancer, with a multivariable-adjusted OR of 2.61 (95% CI: 1.18–5.74, P = 0.02) and a PS-adjusted OR of 2.62 (95% CI: 1.11–6.20, P = 0.03). WT1 methylation was also significantly associated with the risk of the luminal B subtype, with ORs of 2.49 (95% CI: 1.13–5.51, P = 0.02) and 3.23 (95% CI: 1.34–7.80, P = 0.01) after multivariable and PS adjustment, respectively. However, WT1 methylation was not significantly associated with the risk of the HER-2 enriched and TNBC subtypes (Table 2).

CA10 methylation was associated with the risk of the luminal B subtype of breast cancer, with multivariable-adjusted and PS-adjusted ORs of 2.04 (95% CI: 1.30–3.21, P < 0.01) and 1.80 (95% CI: 1.09–2.98, P = 0.02), respectively. However, CA10 methylation had no significant association with the risk of the luminal A, HER-2 enriched and TNBC subtypes after PS adjustment. The associations between the methylation of WT1 and CA10 and other clinicopathological characteristics of breast cancer patients are shown in Table S5.

Association between WT1, CA10 methylation and breast cancer risk in GEO dataset

The GSE51032 dataset is a nested case-control study that includes 233 female breast cancer cases and 340 female cancer-free controls. After extracting the data from the 450K array, we identified two CpG loci in each of our targeted WT1 and CA10 sequences (Fig. 1). ROC curves were used to determine the cut-off values of β, which were 0.057 and 0.226 for the average β of the probes in WT1 and CA10, respectively. The average methylation level of Cg14657517 and Cg19074340, which are located within the targeted WT1 sequence, was associated with breast cancer, with an OR of 1.88 (95% CI: 1.25–2.83, P = 0.03). However, the average methylation level of Cg14054928 and Cg20405017, which are located within the targeted CA10 sequence, was not significantly associated with breast cancer risk (OR = 0.76, 95% CI: 0.54–1.06, P = 0.11) (Table 5).
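The text does not state which ROC criterion was used to pick the β cut-off. One common choice is the Youden index (sensitivity + specificity − 1), and the sketch below assumes it purely for illustration: it averages two probes per sample, finds the cut-off that maximizes the index, and dichotomizes the samples. All β values, labels and names are hypothetical.

```python
def youden_cutoff(values, labels):
    """Return (cutoff, J) maximizing sensitivity + specificity - 1.

    A sample is called methylation-positive when value >= cutoff.
    labels: 1 = case, 0 = control.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one case and one control")
    best_cut, best_j = None, -1.0
    for cut in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if y == 1 and v >= cut)
        fp = sum(1 for v, y in zip(values, labels) if y == 0 and v >= cut)
        j = tp / n_pos - fp / n_neg  # sensitivity - (1 - specificity)
        if j > best_j:  # ties keep the lowest cutoff
            best_cut, best_j = cut, j
    return best_cut, best_j


# Hypothetical beta values for two probes in one gene; NOT study data.
probe_1 = [0.02, 0.04, 0.05, 0.03, 0.07, 0.08, 0.06, 0.09]
probe_2 = [0.04, 0.04, 0.07, 0.05, 0.07, 0.10, 0.08, 0.09]
labels = [0, 0, 1, 0, 0, 1, 1, 1]

# Average the two probes per sample, then pick the cutoff.
mean_beta = [round((a + b) / 2, 4) for a, b in zip(probe_1, probe_2)]
cutoff, j_stat = youden_cutoff(mean_beta, labels)
calls = [int(v >= cutoff) for v in mean_beta]
print(cutoff, round(j_stat, 2), calls)
```

The resulting binary calls could then be entered into a logistic regression model in the same way as the MS-HRM methylation status in the case-control study.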

Table 5 The association between the average methylation of gene CpG sites and the risk of female breast cancer in GSE51032

Discussion

This is the first case-control study to investigate the associations between the methylation of WT1 and CA10 in leukocyte DNA and breast cancer risk, including the risk of different molecular subtypes of breast cancer, in a Chinese female population. After PS adjustment, WT1 methylation was significantly associated with elevated breast cancer risk, with an OR of 3.07 (an increase of 2.07 over the reference odds), whereas CA10 methylation was only marginally associated with breast cancer risk, with an OR of 1.35. Relative to women without WT1 methylation, women with WT1 methylation had ORs of 2.62 for the luminal A subtype and 3.23 for the luminal B subtype (increases of 1.62 and 2.23, respectively). CA10 methylation was significantly associated with the risk of the luminal B subtype, with an OR of 1.80. We subsequently used the GEO-GSE51032 dataset, a nested case-control study with a clear temporal relationship between methylation changes and breast cancer, as an external dataset to validate our retrospective study. The nested case-control results showed a weaker but still significant association between WT1 methylation and breast cancer risk, whereas the association between CA10 methylation and breast cancer risk was not statistically significant.

Breast cancer is a heterogeneous disease with different molecular subtypes, which may present different genetic and epigenetic susceptibilities. Previous studies predominantly focused on aberrant methylation in tissue samples and its association with the risk of different molecular types of breast cancer [20, 21], and few studies have focused on gene-specific methylation in leukocyte DNA. Methylation alterations in leukocyte DNA may reflect a response of the hematopoietic system [22]. Leukocyte DNA methylation can represent germline methylation, which can be used to analyze the association with cancer risk [23]. It was further reported that BRCA1 hypermethylation in peripheral blood DNA was associated with TNBC, with an OR of 5.0 [24]. The results of our study indicated that, after PS adjustment, WT1 methylation was associated with the risk of the luminal A and luminal B subtypes of breast cancer, with ORs of 2.62 and 3.23, respectively, and that CA10 methylation was significantly associated with the luminal B subtype, with an OR of 1.80.

WT1, located on 11p13, encodes a zinc finger transcription factor and was first identified as a tumor suppressor gene. WT1 exon displayed significantly increased methylation in cancer tissue compared to nonmalignant breast tissue [16]. WT1 methylation in the promoter and first exon region was shown to be associated with the silencing of WT1 mRNA expression in MCF-7 and MDA-MB-231 breast cancer cells [9]. Our investigated sequence lies 160 bp downstream of the sequence examined by Laux et al. Here, we observed methylation in blood leukocyte DNA of a CpG island region in the first exon of WT1, which contains 11 CpGs. Furthermore, using external data from the EPIC-Italy cohort (GEO-GSE51032), which has a nested case-control design, we found a significant association between WT1 methylation and breast cancer risk based on the two CpG probes inside our sequence, with an OR of 1.88.

A previous study showed that CA10 can undergo methylation in tumor tissue during breast carcinogenesis [16]. CA10 was also reported to be hypermethylated among a panel of genes in urine, which may contribute to the highly accurate and early detection of bladder cancer [25]. The results of our study suggested that CA10 methylation in leukocyte DNA was marginally associated with an elevated breast cancer risk after PS adjustment. The amplified sequence contained 7 CpGs and was located in the second exon of CA10. The external validation dataset, GEO-GSE51032, included only 2 CpG probes in this sequence and did not exhibit a significant association between CA10 hypermethylation and breast cancer risk.

To further investigate the functional relevance of the observed associations, it would be important to test whether methylation of the specific CpGs in WT1 and CA10 is associated with altered expression of these genes. Therefore, we investigated the correlations between methylation probes and expression using the TCGA (http://cancergenome.nih.gov/) and Mexpress (https://mexpress.be/) databases. The results showed that WT1 hypermethylation was negatively correlated with its expression (Cg14657517, r = −0.204, P < 0.001; Cg19074340, r = −0.201, P < 0.001), and CA10 hypermethylation was likewise negatively correlated with its mRNA expression (Cg14054928, r = −0.182, P < 0.001; Cg20405017, r = −0.162, P < 0.001). Although this comparison is limited by the different sample source of the DNA, the significant negative correlations between WT1 and CA10 methylation and gene expression were consistent with our study and indicate promising potential for breast cancer risk assessment.
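The text reports correlation coefficients r between probe methylation and expression without naming the coefficient. The sketch below assumes Pearson's r and shows how such a value is computed from paired measurements; the function name and the five paired values are hypothetical and are not drawn from TCGA or Mexpress.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of cross-products and squares about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical probe beta values and log-scale expression; NOT real data.
beta = [0.10, 0.20, 0.30, 0.40, 0.50]
expression = [9.0, 8.5, 8.7, 7.9, 7.5]
r = pearson_r(beta, expression)
print(round(r, 3))
```

A negative r, as in this toy example, indicates that higher methylation tends to accompany lower expression; the coefficients reported in the text are far smaller in magnitude.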

In our previous study, we tested the accuracy of MS-HRM by assessing the WT1 methylation level with both MS-HRM and pyrosequencing, and the results of the two methods were highly correlated [26]. However, the methylation level of leukocyte DNA is relatively low, and the detection limit of pyrosequencing is 2%. As a reliable and highly sensitive technique, MS-HRM can be used to assess methylation levels of targeted CpGs as low as 0.1% [27]. The high consistency of our results across runs argues against misclassification of methylation level between cases and controls, and the likely higher sensitivity of MS-HRM compared with pyrosequencing makes our results more conservative [28].

The limitations of this study should be taken into consideration before drawing conclusions. First, as in all retrospective analyses, our study may be subject to recall bias in the information collected on environmental factors. Second, the sample size of our study is not large enough for some subgroup analyses, including those of low-frequency environmental factors such as smoking behavior; therefore, their associations with the DNA methylation of WT1 and CA10 could not be analyzed. Third, selection bias may have occurred, since we recruited the breast cancer patients at the Tumor Hospital of Harbin Medical University, and they might not be fully representative of the overall distribution of breast cancer patients.

Conclusion

In summary, the results of our study suggest that the methylation of WT1 and CA10 in blood leukocytes may be associated with the risk of breast cancer. Associations between WT1 methylation and the risk of the luminal subtypes of breast cancer, and between CA10 methylation and the risk of the luminal B subtype, were also observed.