Background

Colorectal cancer (CRC) is the third most common cancer type worldwide [1]. It is a heterogeneous disease in terms of high biological complexity and clinical outcome. Extracellular vesicles (EVs) are known to contribute to tumorigenesis, progression, and drug resistance in CRC [2] and may be important CRC biomarkers [2) have been published by Levin et al.[14] and 9 primers (Supplementary Table S3) were a combination of the published primers. The process of library preparation followed the suggested protocols of Human mtDNA Genome for the Illumina Sequencing platform (Illumina, Inc., San Diego, CA) with the first steps of the protocol adjusted. The master mixes were divided into 4 PCR tubes per sample, and a temperature gradient (51–68 °C) was used during the first amplification. The DNA was subsequently purified using gel electrophoresis, and bands representing the circular mtDNA amplicon (9.1 kilobases and 11.2 kilobases) or short PCR products were cut out from the gel. Extraction and quantification of mtDNA were performed using QIAEX II Gel Extraction Kit (Qiagen) and Qubit dsDNA HS Assay (Thermo Fisher Scientific). Successful long-range PCRs were represented by a bright band of the expected size. The amplicons were pooled and libraries were generated using the Nextera XT DNA Library Preparation Kit and Nextera XT Index Kit (both Illumina). AMPure XP beads (Beckman Coulter, Brea, CA, USA) were used to purify the DNA library and provide a size-selection step to remove short library fragments. Bioanalyzer-based normalization was performed using the Agilent High-Sensitivity DNA Kit (Agilent Technologies, Waldbronn, Germany) and the libraries were pooled and sequenced on a MiSeq Benchtop Sequencer (Illumina) using a MiSeq Reagent Kit v3 (Illumina) with 2 × 300-basepair read lengths.

mtDNA variant analyses

All generated sequence data were mapped to the revised Cambridge Reference Sequence (GenBank ID NC_012920.1) [15, 16] using the MiSeq Reporter built-in software v2.6 (Illumina). This software applies the Burrows-Wheeler Aligner [17] and generates BAM alignment files. Mutserve, run via mtDNA-Server (https://mtdna-server.uibk.ac.at) [18], was used for variant calling and annotation with default parameters and filter settings: Minimum Base Call Quality Score for a Call (< 30), Indel Repeat Length (> 8), and Low Variant Frequency (< 0.010). Only variants that passed the final filter were included for downstream analysis. This variant caller has various internal quality controls and was shown to perform best compared to other variant callers with regard to evaluating heteroplasmy [19]. Variant frequencies > 0.990 were defined as homoplasmy, while heteroplasmy was defined by frequencies of 0.10–0.990 and low-level heteroplasmy by < 0.10. Variants flagged by Mutserve as previously reported nuclear mitochondrial DNA segments (NUMTs) were identified. Haplocheck v1.3.2 was used to detect contamination in the mtDNA samples [20] and Haplogrep v2.3.0 for haplogroup classification [21], both through the mtDNA-Server. The Ensembl Variant Effect Predictor software was used with default parameters to predict the potential role of the variants [22]. An mtDNA circular plot was made in Geneious (v2023.0).
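The frequency thresholds above can be illustrated with a minimal sketch. This is not part of the Mutserve pipeline; the function name `classify_variant` is hypothetical, and the handling of the exact boundary values (0.10 and 0.990 counted as heteroplasmy) is an assumption based on the stated ranges.

```python
def classify_variant(frequency: float) -> str:
    """Assign a variant to a class using the thresholds stated in the text.

    Assumption: the boundary values 0.10 and 0.990 fall in the
    heteroplasmy class, since homoplasmy is defined as > 0.990 and
    low-level heteroplasmy as < 0.10.
    """
    if not 0.0 <= frequency <= 1.0:
        raise ValueError("variant frequency must lie between 0 and 1")
    if frequency > 0.990:
        return "homoplasmy"
    if frequency >= 0.10:
        return "heteroplasmy"
    return "low-level heteroplasmy"


# Illustrative frequencies only, not study data.
for freq in (0.995, 0.45, 0.03):
    print(freq, classify_variant(freq))
```

Variants below the Low Variant Frequency filter (< 0.010) would already have been removed upstream by the variant caller, so the sketch does not treat them separately.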

Quantification of mtDNA damage

This assay relies on the ability of a modification on the template DNA to inhibit restriction enzyme cleavage, as detailed previously [23]. Total DNA from the DNase-treated and non-treated pooled patient samples was analyzed with droplet digital PCR. A sequence flanking a TaqI restriction enzyme site in the 12S ribosomal RNA gene (MT-RNR1) was amplified using the forward (5′- AAA CTG CTC GCC AGA ACA CT -3′) and reverse (5′- CAT GGG CTA CAC CTT GAC CT-3′) primers in the absence and presence of the enzyme. The samples were partitioned by the QX200 Droplet Generator (Bio-Rad Laboratories, Oslo, Norway) and analyzed with the QX200 Droplet Reader (Bio-Rad Laboratories). The data was given as the percentage of non-digested (nd) mtDNA [(mtDNATaqI copies per μl – mtDNAnd copies per μl) × 100].

Statistical considerations

Analyses were performed using GraphPad Prism v9.2.0. When comparing one variable in matched samples from patients, differences between groups were assessed by paired Student’s t-test or Repeated-Measures ANOVA with Dunnett’s multiple comparison test; when comparing several variables, a 2-way ANOVA was used. The Mann–Whitney U test was used when comparing differences between the cancer types and patient groups. p-values less than 0.05 were considered statistically significant. Pearson correlation analysis was performed.
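As an illustration of the paired comparison for matched specimens, the following sketch computes the paired t statistic with the Python standard library. The study itself used GraphPad Prism; the function name and the input values here are hypothetical and serve only to show the calculation.

```python
import math
import statistics


def paired_t_statistic(first, second):
    """Return (t, degrees of freedom) for two matched series.

    The statistic is the mean of the pairwise differences divided by
    the standard error of those differences.
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two matched series with at least two pairs")
    diffs = [a - b for a, b in zip(first, second)]
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    standard_error = sd / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error, len(diffs) - 1


# Hypothetical matched measurements, not study data.
t_value, dof = paired_t_statistic([3, 5, 7, 9], [1, 2, 3, 4])
print(round(t_value, 3), dof)
```

The p-value would then be read from a t distribution with the returned degrees of freedom, a step that statistical packages perform directly.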

Results

mtDNA variant number and heteroplasmy – rectal cancer patients

The median coverage depth was 15 237 × . After processing the sequences with adequate quality scores (Q30, median of 89.9%), the proportions of aligned sequence reads for WB, PBMCs, EVs, and FFPE tumor tissue were 98.7%, 99.7%, 99.8%, and 99.8%, respectively. The ratio of transversions and transitions, and GC content across the mitochondrial genomes were comparable for all four tissue specimens from the eight rectal cancer patients (Supplementary Table S4). As expected [24], the total number of variants was similar in WB and PBMCs with a median of 41.0 (range, 31–107) and 36.5 (range, 27–58), respectively (Fig. 1a). When comparing all tissue types, differences were detected (Repeated-Measures ANOVA: sample types (column) p < 0.0001 and patients (row) p = 0.71).

Fig. 1

Mitochondrial DNA variant number and the level of heteroplasmy in the colorectal cancer patients. Total variant number in (A) whole blood (WB), peripheral blood mononuclear cells (PBMC), plasma extracellular vesicles (EV), and formalin-fixed paraffin-embedded (FFPE) tumor specimens from rectal cancer patients (n = 8) and in (B) WB, EV, and fresh-frozen (FF) tumor specimens from colon cancer patients (n = 8). Points connected with a line represent matched patient specimens. The mean number of all variants detected in each sample type was compared to the EV number by Dunnett's multiple comparison. Venn diagram of all variants detected in the different sample types from (C) rectal (n = 8) and (D) colon (n = 8) cancer patients. Heteroplasmy levels in the different sample types from (E) rectal (n = 8) and (F) colon (n = 8) cancer patients. Low-level heteroplasmy (< 0.10) variants in the different sample types from (G) rectal (n = 8) and (H) colon (n = 8) cancer patients. Points connected with a line represent matched patient specimens. The mean number of all variants detected in each sample type was compared to the EV number by Dunnett's multiple comparison. p-values for all relevant panels: *, 0.033; **, 0.0021; ***, 0.0002; ****, < 0.0001

Characterization of plasma EVs by NTA showed a median concentration of 1.85 × 10^9 (range, 1.5 × 10^9–2.1 × 10^10) particles/ml and a median mode size of 124.3 (range, 102.0–255.9) nm (Fig. 2a, b). The NTA histogram (Fig. 2c), western blot (Fig. 2d; full-length blots are presented in Supplementary Figure S1), and TEM images (Fig. 2e) from the selected patient sample confirmed vesicles of various sizes with the expected cup shape and the presence of expected EV proteins (CD63, CD9, ALIX). The heterogeneous EV sample showed absence of the contamination marker from the Golgi apparatus (GM130), but APOA1, a protein found in high-density lipoproteins, was detected. The samples were pre-treated with DNase and proteinase to eliminate contaminating molecules adherent to the EV surface or present in plasma, which resulted in a significant reduction (19%) in total DNA concentration (Supplementary Figure S2a; paired t-test: p = 0.0005). To examine whether the DNase treatment might artificially generate new mtDNA variants, the samples treated with and without DNase were analyzed for damage in MT-RNR1 and showed similar damage levels (Supplementary Figure S2b; paired t-test: p = 0.49). For the total rectal cancer cohort, plasma EVs presented more than twice as many mtDNA variants as WB and PBMCs, with a median number of 113.0 (range, 70–224; Dunnett's multiple comparison: p = 0.046 and p = 0.020, respectively; Fig. 1a).

Fig. 2

Characterization of plasma extracellular vesicles (EVs) from the colorectal cancer patients. A Concentration (mean ± standard error) and (B) mode size (mean ± standard error) of vesicles (n = 16) measured by Nanoparticle Tracking Analysis (NTA). The mean value of each cancer type was compared by Mann–Whitney U test (ns, not significant). C NTA histogram (concentration and size (mean ± standard error) for the experiment) of three combined 60-s videos from a representative EV sample. D Western blot images of the EV markers CD63, CD9, and ALIX and contamination markers GM130 and APOA1 expression in EVs from one patient. HCT116 colorectal cancer cells were used as positive control (Pos ctr). Full-length blots are presented in Supplementary Figure S1. E Transmission electron microscopy images of EVs from one patient. The positions of the zoom in panels within the wide-field views are indicated by black boxes; scale bars are 200 nm (black) and 500 nm (black and white)

The FFPE tumors showed a significantly higher mtDNA variant number than the other tissues, with a median of 327.5 (range, 167–391; Dunnett's multiple comparison: p < 0.0001; Fig. 1a), corresponding to approximately sevenfold the WB number. This finding was not unexpected, as the FFPE-derived DNA was fragmented and the mtDNA genome was amplified in a small-amplicon format that could possibly introduce false positives. The fixation process impacts the quality of the DNA through undesirable modifications such as deamination, which introduces C:G > T:A mutations, and these substitutions were associated with the total variant number (TVN; Pearson correlation: r = 0.93, p = 0.0009); hence, the FFPE data must be considered with care because of the different pre-processing protocols. Both FFPE tumor tissue and EVs had considerable numbers of private variants not detected in WB or PBMCs. A total of 104 variants (4.8%) were shared among all the sample types from the rectal cancer patients, and the EVs and FFPE tumors had the largest overlap in variants of any two sample types (Fig. 1c). The mtDNA variants were also analyzed for the level of heteroplasmy, as an initial investigation into the diversity of the mitochondrial genomes in the various tissues (Fig. 1e). As shown in Fig. 1g, low-level heteroplasmy (< 0.10) mtDNA variants were more frequent in the EVs than in WB and PBMCs (Repeated-Measures ANOVA: sample types p < 0.0001 and patients p = 0.84; Dunnett's multiple comparison: p = 0.042 and p = 0.017, respectively), whereas FFPE tumor samples had a higher number of low-level heteroplasmy variants than EVs (Dunnett's multiple comparison: p < 0.0001).
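To make the C:G > T:A class concrete: on the reference strand, a deamination-type substitution appears either as C > T or as its complement G > A. The sketch below counts the fraction of such substitutions in a list of single-nucleotide variants. The function names and the example variants are hypothetical and are not taken from the study's pipeline or data.

```python
# Reference/alternate allele pairs that make up the C:G > T:A class.
CG_TO_TA = {("C", "T"), ("G", "A")}


def is_cg_to_ta(ref: str, alt: str) -> bool:
    """True if the substitution is C>T or its complement G>A."""
    return (ref.upper(), alt.upper()) in CG_TO_TA


def cg_to_ta_fraction(variants) -> float:
    """Fraction of (ref, alt) pairs that belong to the C:G > T:A class."""
    variants = list(variants)
    if not variants:
        return 0.0
    hits = sum(1 for ref, alt in variants if is_cg_to_ta(ref, alt))
    return hits / len(variants)


# Illustrative variants only, not study data.
example = [("C", "T"), ("G", "A"), ("A", "G"), ("T", "C")]
print(cg_to_ta_fraction(example))
```

Per-sample counts of this kind are what one would correlate against the total variant number to check for a fixation-related signature.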

mtDNA variant numbers and heteroplasmy – colon cancer patients

We sequenced WB, EVs, and FF tumor samples (Q30, median of 88.4%) available from eight colon cancer patients to investigate further whether EVs and tumor tissue harbor an increased number of mtDNA variants. Here, all biospecimens were rapidly frozen, circumventing the effects of artificial mutations induced by formalin fixation. The median sequencing coverage depth was 12 289 ×, and the proportions of reads mapping to the reference mitochondrial genome for WB, EVs, and FF tumor tissue were 99.3%, 99.7%, and 99.5%, respectively. The ratio of transversions and transitions, and GC content were comparable for WB, EVs, and FF tumors (Supplementary Table S4). When comparing the various tissue types, differences were detected between groups (Repeated-Measures ANOVA: sample types p = 0.0061 and patients p = 0.24).

The colon cancer plasma EVs had a median concentration of 8.03 × 10^9 (range, 1.3 × 10^9–3.0 × 10^10) particles/ml and a median mode size of 119.3 (range, 105.4–165.5) nm (Fig. 2a, b). We could verify an increased total number of mtDNA variants in EVs (median 86.5; range, 54–202) compared to WB (median 58; range, 34–115) (Dunnett's multiple comparison: p = 0.021) as well as to FF tumor tissue (Dunnett's multiple comparison: p = 0.0047; Fig. 1b). A total of 125 variants (16.3%) were shared among all sample types from the colon cancer patients, and although WB and FF tumor (median 51; range, 53–66) had similar total variant numbers, tissue-specific features appeared (Fig. 1d). As also shown in Fig. 1d, similar to the rectal cancer patients, the EVs contained numerous exclusive variants, and the EVs overlapped with the tumor tissue to a higher degree than WB did. All EV mtDNA sequencing data were used to further emphasize that the full mtDNA genome was present inside the vesicles, represented by a circular plot (Supplementary Figure S3).

Figure 1f shows the colon cancer patients’ mtDNA variants represented as homoplasmic, heteroplasmic, and low-level heteroplasmic states. Low-level heteroplasmy variants were abundant in the plasma EVs compared to WB and FF tumor (Repeated-Measures ANOVA: sample types p = 0.0066 and patients p = 0.53; Dunnett’s multiple comparison: p = 0.033 and p = 0.0044, respectively; Fig. 1h).

Sequencing the complete mitochondrial genome in FFPE tissues necessitated multiple PCR primer sets for mtDNA amplification because the long-range PCR did not amplify successfully. To investigate whether the different mtDNA amplification methods could explain the differences in variant number, the multi-primer method was applied to WB samples and the results compared with the original (two primer pairs) sequence data from three of the colon cancer patients. The multi-primer method (the technical quality of the sequence reads is shown in Supplementary Table S4) yielded an approximately tenfold increase in variant number to a median of 612 (range, 562–721) from 60 (range, 53–66), pointing to a bias with the use of multiple primers (Supplementary Figure S4; paired t-test: p = 0.0057). Amplifying the mtDNA genome in small-amplicon format can increase the risk of co-amplifying NUMT segments that can be misinterpreted as mtDNA heteroplasmy. Indeed, a considerably increased risk of NUMT co-amplification with the multi-primer approach was observed for the WB samples (Supplementary Table S5). The FFPE tissue had an increased chance of co-amplified NUMTs compared to the other samples, and also showed indications of cross-contamination (Supplementary Table S5), making FF tumor tissue more appropriate for analysis of the full mitochondrial genome.

Distribution of variants within the mitochondrial genome

Among the mtDNA variants detected in rectal cancer patient samples, 39.6% in WB, 40.0% in PBMCs, 54.0% in plasma EVs, and 52.0% in FFPE tumors were distributed along protein-coding regions and the remainder in non-coding regions (D-loop, rRNA, tRNA, intergenic regions). In the colon cancer patient samples, the proportions were 42.7% of the variants in WB, 48.4% in EVs, and 47.8% in FF tumor. Normal-cell heteroplasmies tend to cluster within the non-coding D-loop, whereas tumor-specific somatic mutations are more evenly dispersed across both coding and non-coding regions [25]. Interestingly, for both cancer types, the ratio of coding-region variants to D-loop variants was highest in EVs (4.2 for rectal cancer, 3.8 for colon cancer) and higher in tumor tissue (2.3 for both FFPE rectal cancer specimens and FF colon cancer specimens) than in WB and PBMCs (1.6 for the rectal cancer samples, 1.8 for the colon cancer samples), suggesting that the mtDNA in plasma EVs carried molecular information pointing towards the acquisition of functional variants.

We found differences between the sample types and the various regions of the mtDNA in both the rectal (2-way ANOVA: regions p < 0.0001 and sample types p < 0.0001) and colon (2-way ANOVA: regions p < 0.0001 and sample types p = 0.038) cancer patients (Fig. 3). Of the 13 protein-coding mitochondrial genes, the majority showed quite similar variant numbers in WB and PBMCs and higher numbers in EVs and FFPE tumors from rectal cancer patients (Fig. 3a). In the colon cancer specimens (Fig. 3b), WB and FF tumor showed quite similar numbers of variants for the genes, while EVs had higher variant numbers. MT-ND5 (NADH dehydrogenase, subunit 5 of complex I) had the most variants in all samples, except in FFPE tissue, where MT-ND4 (NADH dehydrogenase, subunit 4 of complex I) was the most affected. Based on mutations per kilobase (Fig. 3a, b), MT-ND5 and MT-ND4 were also the most affected in WB and FFPE tissue samples, respectively, from the rectal cancer patients. Generally, in plasma EVs, MT-ND1 (NADH dehydrogenase, subunit 1 of complex I) and MT-CO1 (cytochrome c oxidase, subunit 1 of complex IV) had the most mutations per kilobase.
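The per-kilobase normalization matters because the mitochondrial genes differ considerably in length, so a long gene such as MT-ND5 can accumulate the most variants without having the highest density. A minimal sketch of the calculation follows. The gene lengths are assumptions taken from the commonly cited rCRS annotation and should be checked against the annotation actually used; the variant counts are hypothetical.

```python
# Assumed gene lengths in base pairs (rCRS annotation); verify before use.
GENE_LENGTH_BP = {
    "MT-ND1": 956,
    "MT-CO1": 1542,
    "MT-ND4": 1378,
    "MT-ND5": 1812,
}


def variants_per_kb(variant_count: int, length_bp: int) -> float:
    """Normalize a variant count to variants per kilobase of sequence."""
    if length_bp <= 0:
        raise ValueError("gene length must be positive")
    return variant_count / (length_bp / 1000)


# Hypothetical counts, not study data: the same count in a shorter gene
# gives a higher density.
for gene in ("MT-ND1", "MT-ND5"):
    print(gene, round(variants_per_kb(10, GENE_LENGTH_BP[gene]), 2))
```

With equal counts, the shorter MT-ND1 yields a higher per-kilobase value than MT-ND5, which is how a gene can rank differently by density than by raw variant number.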

Fig. 3

Variant numbers of the individual mitochondrial genes, including non-coding regions, in the colorectal cancer patients. Mean ± standard error of the variant number (upper panels) in each mitochondrial gene and the non-coding regions (D-loop, rRNA, tRNA) and the variant mutation rate per kilobase (kb; lower panels) are shown for each gene, as analyzed in whole blood (WB), peripheral blood mononuclear cells (PBMC), plasma extracellular vesicles (EV), and formalin-fixed paraffin-embedded (FFPE) or fresh-frozen (FF) tumor specimens from (A) rectal cancer patients (n = 8) and (B) colon cancer patients (n = 8). The mitochondrial complex for which each gene encodes a protein subunit is shown at the bottom

Predicted variant effects on the protein structure

Finally, we determined potential consequences of the variants in the coding mtDNA sequences (Fig. 4a, b) to explore whether protein structures might be affected. The tumor tissues and plasma EVs contained nonsense mutations that cause premature stop codons and are estimated to have a high effect on protein function (disruptive, probably causing protein truncation, loss of function, or triggering nonsense-mediated decay). These were not detected in WB and PBMCs. In addition, the EVs were more abundant in missense mutations (Fig. 4c, d), which change the amino acid and normally have moderate effects on the transcripts (non-disruptive, might change effectiveness), in both rectal (Repeated-Measures ANOVA: sample types p = 0.0002 and patients p = 0.0050; Dunnett’s multiple comparison: EVs versus WB p = 0.0005, EVs versus PBMCs p = 0.0004, EVs versus FFPE tumor p = 0.0004) and colon (Repeated-Measures ANOVA: sample types p = 0.0049 and patients p < 0.0001; Dunnett's multiple comparison: EVs versus WB p = 0.020 and EVs versus FF tumor p = 0.0036) cancer patients. The synonymous variants are estimated to have a low effect on the protein and are assumed to be mostly harmless or unlikely to change protein behavior. Coding sequence variants are listed as modifiers, and their effects are difficult to predict.

Fig. 4

Variant effect predictions of mitochondrial DNA in the colorectal cancer patients. Distribution of mutation types in coding sequences of whole blood (WB), peripheral blood mononuclear cells (PBMC), plasma extracellular vesicles (EV), and formalin-fixed paraffin-embedded (FFPE) or fresh-frozen (FF) tumor specimens in (A) rectal (n = 8) and (B) colon (n = 8) cancer patients. Missense mutation burden (in %) of WB, PBMC, plasma EV, and FFPE tumor specimens from (C) rectal cancer (n = 8) patients and WB, plasma EV, and FF tumor specimens from (D) colon cancer (n = 8) patients. Points connected with a line represent matched patient specimens. The mean number of all variants detected in each sample type was compared to the EV number by Dunnett's multiple comparison (*, p = 0.033; **, p = 0.0021; ***, p = 0.0002; ****, p < 0.0001)

Biological relevance of circulating mtDNA

We further investigated whether circulating mtDNA characteristics were dependent on the patient TN-status (Supplementary Figure S5). A higher degree of low-level heteroplasmy (< 0.10) mtDNA variants was observed in EVs from patients with lymph node metastasis (N1-2; Mann–Whitney U test: p = 0.046). Taken together, the EV mtDNA pointed towards a complex composition of secreted mtDNA with more low-level heteroplasmy and variants with potential impact on the transcripts.

Discussion

In this study, for the first time we successfully isolated and sequenced the mtDNA cargo of plasma EVs for comparison with the mtDNA in WB, PBMCs, and tumor tissue from CRC patients. The EVs were abundant in mtDNA with a more complex composition, including a higher degree of low-level heteroplasmy, compared to WB as reference. The EVs had numerous private mtDNA variants not detected in WB, PBMCs, or tumor tissue. The variants clustered in the coding regions, forming mutations with impact on the transcripts. Our data also highlight the possibility of analyzing the full mitochondrial genome of FFPE tumor tissue, but the technical requirements implied that FF tumor tissue was more expedient for the purpose. In both cases, a higher overlap of variants was detected between EVs and tumor tissue than between WB and tumor tissue, suggesting that circulating EV mtDNA merits further study.

We have previously shown that plasma EVs from the rectal cancer patients, when fed to cultured human monocytes, caused monocyte transcriptional responses comprising protein binding, apoptotic mitochondrial changes, immune cell signaling, and cell growth, among other biological processes [26]. In the present study, we verified that EVs contained the intact mitochondrial genome. The total DNA concentration of the EV samples was reduced by the DNase and proteinase treatment, but it did not affect the total mtDNA variant number (data not shown) nor did it cause more mtDNA (MT-RNR1) damage. Overall, this suggests that the full genome is present and protected inside the vesicles.

Replication is the primary source of new mutations in mtDNA. The mutation rate observed in mtDNA is 10–17 times higher than that of the nuclear genome, which is attributed to the lack of histones and efficient DNA repair mechanisms, and to the proximity to reactive oxygen species generated by oxidative phosphorylation [27]. Tumor cells have altered bioenergetic processes, such as increased glucose metabolism, altered calcium regulation, altered production of reactive oxygen species, or altered interorganelle interaction. These changes may result from pre-existing or de novo mutations of nuclear DNA or mtDNA, changes in gene copy number, or altered gene expression [5]. The plasma EV mtDNA variants were in both the coding and non-coding regions but with a high protein-coding region to D-loop ratio, suggesting that EV mtDNA entails adaptive metabolic features. Of the 13 protein-coding genes, MT-ND5 had the highest number of variants in plasma EVs from both colon and rectal cancer patients. MT-ND5 is the most frequently mutated mitochondrial gene in cancer [28]. There is evidence for negative selection of truncation mutations in the mtDNA genes, but for some malignancies, including CRC, the opposite has been shown, with a suggested functional oncogenic impact of mitochondrial mutations in the initiation and clonal evolution of the cancer [28]. In our study, the EV mtDNA displayed high diversity and several distinctive variants not found in the other samples, pointing towards a possible involvement of EVs in the regulation of mtDNA heterogeneity in the patients. However, since we did not sequence samples from healthy controls and EVs are secreted by all cells, we cannot exclude that heterogeneous EV mtDNA variants originate from tissues other than the cancer cells. To truly answer this, further studies are needed.

The mtDNA contains three relevant classes of variants: recent germline mutations, somatic mutations, and ancient adaptive polymorphisms. These variants appear within a cell with normal mtDNA, generating a mixed mitochondria-containing cytoplasm of variant and reference mtDNA, a state known as heteroplasmy [5]. Detection of low-level (< 0.10) heteroplasmy has been important for diagnosis and prognostication of mitochondrial diseases, but also in cancer and age-related research [29]. In plasma EVs from both the colon and rectal cancer patients, low-level heteroplasmy variants were frequent. In order to sequence the complete mitochondrial genome in FFPE tissues, the mtDNA amplification necessitated multiple PCR primer sets. The apparently low number of variants in FF tumor compared to FFPE tumor suggested that the tumor tissue conservation method or the mtDNA amplification procedure before sequencing might have had an impact on variant detection. Additional mtDNA variants might have been generated during the tissue formalin-fixation process in the form of nucleotide modifications such as G > A and C > T transitions, which have previously been suggested as artifacts resulting from this conservation method [30, 31], as well as through increased co-amplification of NUMTs [32, 33]. Such artifactual sequence information may be reduced by experimental or bioinformatic methods, but none of these is sufficiently standardized. Of note, the mtDNA copy number per cell can vary by several orders of magnitude depending on the cell type [34]. Our current strategy did not allow us to determine whether the variants came from the enrichment of mutant mtDNA or from mtDNA copy number variations.

Although the origin of the EVs is unknown, in our study they showed similarities to the tumor tissues with the presence of nonsense mutations, and could possibly be involved in tumor cell signaling to adjust their metabolic needs. However, the cell has multiple pathways to recover mtDNA and maintain mitochondrial quality, including mtDNA repair, degradation, clearance, and release. Damaged mtDNA can be removed from the cells through EVs (by means of fragmented or the intact full genome), migrasomes, or other pathways of clearance, in order to maintain cell homeostasis (reviewed in [35]). The plasma EV mtDNA was abundant in coding region missense mutations, making it tempting to speculate that functionally detrimental somatic mtDNA mutations in cells can be expelled via EVs.

Limitations of this study include the influence of sample storage and potential mtDNA amplification bias. Experimental factors [29] and especially contaminating NUMTs, generated by the transfer of mtDNA into the nuclear genome, can complicate mtDNA sequencing analysis [32, 33]. However, the long-range targeted PCR prior to sequencing circumvented this problem to some degree [36], which highlights snap-freezing as the more suitable conservation method for mitogenome analysis of solid tissues. Another limitation is the low number of patients, selected mainly based on the availability of biobank materials, which hampered a more thorough investigation of the biological relevance.

Conclusion

In conclusion, our investigations revealed that plasma EV mtDNA exhibits a high degree of diversity, suggesting involvement in CRC biology.