Introduction

Multiple myeloma (MM) is an incurable clonal plasma-cell malignancy that is more common in older individuals and accounts for approximately 10% of all hematological malignancies1. The standard of care for fit newly diagnosed patients in the United States is induction chemotherapy with a combination of a proteasome inhibitor and an immunomodulatory drug (IMiD), followed by high-dose melphalan chemotherapy and autologous stem cell transplantation (ASCT). This is typically followed by IMiD maintenance until disease progression. Not all patients benefit from ASCT, and it can be associated with significant long-term toxicities including cytopenias and therapy-related myeloid neoplasms (TMN), a risk that is further increased by the use of lenalidomide maintenance2,3,4. However, the clinical benefit of lenalidomide maintenance, as seen in improved overall survival (OS), clearly outweighs its risk. Determining which patients benefit from ASCT and maintenance therapy is an important question and an active area of clinical investigation.

A number of recent studies have identified recurrent somatic mutations in the blood of otherwise healthy adults, a condition referred to as clonal hematopoiesis of indeterminate potential (CHIP)5. CHIP is associated with a 0.5–1% risk of progression to a non-plasma-cell hematologic neoplasm, in particular myelodysplastic syndrome (MDS) and acute myeloid leukemia (AML)19. The presence of CHIP prior to ASCT was not associated with an increased risk of TMN (p = 0.4). However, IMiD maintenance was significantly associated with developing a subsequent TMN (p = 0.047) (Fig. 2a, b), and the presence of CHIP did not increase the risk among those receiving first-line IMiD maintenance or IMiD maintenance at any point post-ASCT (Fig. 2c, d). Nine of the 21 patients who developed MDS/AML were still alive at time of analysis. Seven patients had a myeloma relapse before the diagnosis of MDS/AML. Of the 12 patients who died, 5 died from the TMN while 7 died from a combination of myeloma progression and MDS/AML.

Fig. 2: Outcome of immunomodulator maintenance and CHIP in the context of therapy-related myeloid neoplasms (TMN).

a Cumulative incidence of MDS/AML in patients with IMiD maintenance versus patients without maintenance, with death as an absorbing competing risk. b–d Cumulative incidence of developing MDS/AML among patients with CHIP vs. no CHIP, with death as an absorbing competing risk: b among all patients, c among patients receiving first-line IMiD maintenance and d among patients receiving IMiDs at any point post-ASCT. Groups were tested for equality using a two-sided Gray’s test to compare subdistributions for each competing risk.

We next asked whether CHIP at the time of ASCT was clonally related to a subsequent MDS or AML. Sequential samples from 14 of the 21 TMN patients were available for targeted sequencing (Supplementary Tables 5 and 6). 10/14 patients had an identifiable driver mutation at the time of TMN diagnosis, most commonly in TP53, but only 4 of those patients had a detectable mutation at the time of ASCT. Among the 6 patients in whom we did not initially find CHIP, we were able to identify the TMN driver mutation in 4 of the ASCT samples at VAFs below our threshold of 0.01. Thus, in 8 of 10 patients with identifiable mutations at the time of TMN diagnosis, at least one somatic mutation could also be detected prior to ASCT. In most cases, the driver mutation present prior to ASCT had expanded at the time of TMN diagnosis (Supplementary Table 6). It is possible that with deeper sequencing pre-existing mutations could be found in these additional 2 TMN cases as well.

CHIP is associated with adverse outcomes

Having found no evidence for higher TMN risk in patients with CHIP, we examined whether CHIP was associated with other adverse outcomes. Out of 629 patients in our cohort, 376 patients had died at the time of analysis. The median OS and progression-free survival (PFS) of our cohort were 7.1 and 2.5 years, respectively. After stratification based on age, International Staging System (ISS) and number of treatment lines prior to ASCT, we modeled the association of CHIP with OS and PFS. The median OS of patients with CHIP was 5.3 years, significantly lower than in those without CHIP (7.5 years) (HR: 1.34, p = 0.02, stratified multivariable Cox regression model) (Fig. 3a). Interestingly, CHIP was also associated with a lower median PFS of 2.2 years compared to 2.6 years in those without CHIP (HR: 1.45, p < 0.001, stratified multivariable Cox regression model) (Fig. 3b). We also observed a worse OS and PFS in patients carrying more than one mutation (Supplementary Fig. 5).

Fig. 3: Multivariable Cox regression model of OS and PFS.

a OS and b PFS models for all 629 patients after stratifying by age, ISS and number of lines of therapy prior to ASCT to investigate the effect of CHIP and IMiD maintenance on outcome. Two-sided Wald p-values are shown for each model coefficient with significant effects displayed in red. Exact p-values: a: 0.0197276361 and 0.0000692041; b: 0.0007310355 and 1.547737 × 10^-15. HR: Hazard Ratio; LCI: Lower Confidence Interval; UCI: Upper Confidence Interval.

Unlike prior studies of CHIP, we did not observe excess deaths related to cardiovascular disease or stroke, possibly due to the aggressive nature of multiple myeloma8,19. The most common cause of death was myeloma disease progression, followed by respiratory failure and sepsis. Interestingly, patients with CHIP responded less well to induction therapy: CHIP was associated with a higher post-induction median level of β2-microglobulin (2.3 mg/dL in those with CHIP compared to 2.0 mg/dL in those without [p = 0.008]) and a smaller percentage decrease in M-spike level (p = 0.008) post-induction as compared to diagnosis (Supplementary Tables 7–9).

IMiD maintenance is associated with improved outcomes in all patients

Treatment with IMiDs has been reported to increase the risk of secondary malignancies including MDS and AML2,3,4. Almost all patients received thalidomide or lenalidomide at some point throughout the course of their disease. Therefore, we asked whether treatment with IMiDs was associated with adverse outcomes in patients with CHIP. Only 57% of the patients in our cohort received first-line IMiD maintenance, with 22% receiving it for at least 3 years (range: 0.1–14.9). First-line IMiD maintenance was associated with a longer median OS of 8.5 years compared to 5.6 years in those not receiving IMiD maintenance [HR: 0.65 (0.52–0.80), p < 0.001] (Fig. 3a). As expected, IMiD maintenance was associated with a longer PFS of 3.4 years compared to 1.5 years in those not receiving IMiD maintenance [HR: 0.47 (0.39–0.56), p < 0.001] (Fig. 3b). Only 16 patients received proteasome inhibitor-based maintenance, too few to draw significant conclusions (Supplementary Fig. 6).

In patients not receiving IMiD maintenance, CHIP was associated with a significantly lower median OS of 3.6 years compared to 6.6 years in those who did not have CHIP mutations (p = 0.013, stratified analysis). However, there was no significant difference in those who received maintenance (median OS of 7.7 and 8.9 years for CHIP and no CHIP, respectively, p = 0.49, stratified analysis) (Fig. 4a). Similarly, in patients not receiving IMiD maintenance, CHIP was associated with a significantly lower median PFS of 1.1 years compared to 1.8 years (p < 0.001, stratified analysis). There was also no difference in PFS among patients with and without CHIP who received IMiD maintenance (median PFS of 3.3 and 3.6 years for CHIP and no CHIP, respectively, p = 0.59, stratified analysis) (Fig. 4b). These results suggest the presence of an interaction between the effect of having CHIP and IMiD maintenance, which is significant when it comes to PFS outcome [HR: 0.51 (0.34–0.79), p = 0.002] (Supplementary Fig. 7).

Fig. 4: Overall survival and progression-free survival of patients with respect to IMiD maintenance and CHIP.

a OS and b PFS among patients with CHIP versus those without CHIP in the context of receiving versus not receiving IMiD maintenance post ASCT. Overall and pairwise two-sided log-rank p-values are shown unadjusted for multiple testing.

We next asked whether mutations in specific genes were associated with worse outcomes. The two most commonly mutated genes were DNMT3A and TET2, which were associated with a significantly reduced PFS and OS as compared to patients without CHIP in the absence of IMiD maintenance (Supplementary Fig. 8). In particular, patients with the p.R882 DNMT3A mutation had the worst median OS of 1 year (p = 0.008, stratified analysis) and median PFS of 0.9 years (p = 0.007, stratified analysis) compared to patients without CHIP (Supplementary Fig. 9). However, in patients who received IMiD maintenance, the decrease in OS and PFS seen in p.R882 patients was completely abrogated.

Altogether, these data suggest that the presence of CHIP at time of ASCT does not increase the risk of TMN associated with IMiD maintenance and that patients with CHIP, when treated with IMiD maintenance, obtain a survival benefit similar to that seen in MM patients generally.

Discussion

Here we report the first study to date examining the relationship between CHIP and clinical outcomes in MM, involving over 600 patients undergoing ASCT at a single center with a median follow-up of 9.7 years. The frequency of CHIP in this cohort was lower than that seen in patients with relapsed non-Hodgkin lymphoma (NHL) undergoing ASCT, and its mutational spectrum was more similar to that seen in healthy adults7,8,19. The differences observed between CHIP in MM and NHL are potentially related to the shorter duration of chemotherapy exposure in patients receiving induction therapy for MM and less use of DNA-damaging agents that may select for specific mutant clones. The mutations seen most frequently in patients with relapsed NHL are found in PPM1D and TP53, two genes known to play important roles in the response to DNA damage and chemotherapy resistance30,31,32.

TMN is one of the most feared complications of treatment for MM, and of ASCT in particular, and the presence of mutant hematopoietic clones has been proposed to herald its development. However, we did not observe an increased risk of TMN in patients with CHIP undergoing ASCT. This may be because the mutations selected for by MM induction therapy do not carry a high risk of subsequent myeloid malignancy14,15. Consistent with this hypothesis, the most common mutations in those who developed a TMN were in TP53, which was found relatively infrequently (2.9%) in our cohort at time of ASCT. Further prospective studies are needed to determine whether the presence of TP53 or other specific mutations at the time of ASCT carries a high risk of TMN development and thus might warrant avoidance of ASCT in MM patients harboring them.

We detected CHIP in 21.6% of MM patients at the time of ASCT and found it to be associated with both decreased OS and PFS. In contrast to previous reports, this was not due to an increased risk of TMN or cardiovascular disease19. Unlike in NHL, where ASCT is a potentially curative therapy and many patients may expect a long life expectancy after transplant, MM almost uniformly relapses and may provide insufficient time for cardiovascular disease to manifest. As treatment for MM continues to improve with a concomitant increase in patient life expectancy, the potential long-term non-hematopoietic risks associated with CHIP will need to be evaluated further.

Surprisingly, the primary effect of CHIP on survival was due to an increased risk for myeloma progression. Consistent with this finding, patients with myeloma and CHIP had a higher level of β2-microglobulin and a smaller percentage decrease in their M-protein post induction, compared to those without CHIP. There are several potential mechanisms by which CHIP could promote increased progression of MM. It is possible that patients with CHIP are more prone to the development of cytopenias and other toxicity from therapy, thus increasing the frequency of treatment delays or dose reductions and limiting their ability to receive optimal myeloma directed therapy. Alternatively, the presence of CHIP could alter the bone marrow (BM) microenvironment in such a way as to promote MM progression. Myeloid cells carrying mutations in TET2 and DNMT3A have been reported to stimulate inflammation through upregulation of IL-1β and IL-611,33. Whether a hyperinflammatory phenotype within the BM niche might favor the growth of MM cells and promote more aggressive disease is a prospect that will require further investigation in both animal models and patients.

IMiD maintenance has demonstrated a clear survival advantage in the post-ASCT setting2,3,4. However, it has also been associated with an increased risk of TMN and mutations in TP53 have been reported to promote clonal expansion and the development of resistance in del(5q) MDS treated with lenalidomide34,35. While induction with lenalidomide could have led to the relative enrichment of TP53 at time of ASCT, we surprisingly saw no increase in TMNs in patients with CHIP and the use of IMiD maintenance. In fact, IMiD maintenance was not only associated with an improvement in both PFS and OS but it completely abrogated the deleterious effects of CHIP in the post-ASCT setting. The findings from this study are limited by the fact that it is retrospective and includes patients treated during the introduction of IMiD maintenance and thus not all patients received maintenance therapy. Follow-up studies examining the large randomized studies of placebo vs. lenalidomide maintenance could further define the role of IMiDs in the survival of patients with CHIP. In addition, because all patients received ASCT, the role of high dose melphalan and stem cell transplant cannot be fully dissected in this study and warrants further investigation within trials comparing clinical outcomes of upfront vs. delayed/salvage ASCT or transplant-based vs. drug-based consolidation36,37,38,39,40.

In summary, we found CHIP to be a common entity among MM patients undergoing ASCT. The presence of CHIP was associated with worse outcomes and thus, it would be tempting to screen newly diagnosed MM patients for CHIP before ASCT. However, our data suggest that ASCT, when followed by IMiD maintenance, can be safely utilized regardless of CHIP status. Further well-controlled prospective clinical trials are needed to investigate the interaction between CHIP, transplant and IMiDs on outcomes in multiple myeloma.

Methods

Cohort

Following institutional review board (IRB) approval, we collected the clinical data and all available cryopreserved products of mobilized autologous stem-cells from 629 MM patients who underwent ASCT between January 2003 and December 2011 at the Dana-Farber Cancer Institute (DFCI) in Boston, MA. The cutoff date of 2011 was used to enable enough years of follow up following stem cell transplantation and allow for the monitoring of TMNs and survival data. Clinical information was collected through November 2019. The study design complied with the Declaration of Helsinki and International Conference on Harmonization Guidelines for Good Clinical Practice. All subjects previously provided written informed consent to allow the collection of clinical information and genetic analysis of PB and BM samples for research purposes (DF/HCC IRB 01–206, 07–150 and 16–529).

While all 629 patients received ASCT, 21 of those patients received tandem ASCT, and three received a second ASCT at a later time point. In addition, 38 patients received an allogeneic stem cell transplant; seven of these were tandem transplants, and the rest were performed at a later time point post relapse.

Genomic studies

Deep targeted sequencing was performed on the stem-cell products of 629 MM patients, as well as on available samples of PB and BM aspirates obtained from 15 patients at the time of pre-mobilization and when they developed a hematologic second primary malignancy post-ASCT, respectively. A custom target bait panel of 224 genes was used, including pan-cancer, myeloma and myeloid malignancy-associated genes (Agilent SureSelectXT hybrid capture system). See Supplementary Table 1 for a list of genes and coordinates. Libraries for the stem cell products were constructed automatically, using the Agilent Bravo robot, and were sequenced on the Illumina HiSeq 4000 platform in pools of 32 samples, achieving a 978X total depth of coverage. Libraries of PB and BM samples were constructed manually and were sequenced on the Illumina HiSeq 2500 platform in pools of 24, achieving 556X total depth of coverage. To detect large-scale copy number alterations that reflect tumor cells within the stem cell products, we also performed ultra-low pass whole-genome sequencing (ULP-WGS) at an average genome-wide fold coverage of 0.1×. Detailed information on library preparation, sequencing platforms and computational analysis for targeted sequencing and ULP-WGS is provided in the supplementary information.

Sequencing was done at the Broad Institute, Cambridge, MA, USA.

Computational analyses

Sequencing data was analyzed using the pipelines of the Broad Institute of Harvard and MIT (Firehose, www.broadinstitute.org/cancer/cga). To estimate the presence of tumor, we performed ultra-low pass whole-genome sequencing (ULP-WGS) of all samples to an average genome-wide fold coverage of 0.1×41. The depth of coverage was determined using ichorCNA41,42,43, to estimate large-scale copy number alterations (CNAs) and the fraction of tumor in ULP-WGS. Low coverage samples (<0.05×) were manually reviewed to determine tumor fraction. All samples had a low tumor fraction (3–5.4%).

The targeted sequencing data of our samples were aligned using BWA-mem and the base qualities of the aligned data were re-calibrated using GATK3 Base Quality Score Recalibration (BQSR)44,45. We used the Getz Lab CGA WES Characterization pipeline (https://github.com/broadinstitute/CGA_Production_Analysis_Pipeline) developed at the Broad Institute to call, filter and annotate somatic mutations and copy number variation. We modified this pipeline to call blood samples without matched controls. To that end, we employed the following tools: MuTect46, Strelka47, Orientation Bias Filter48, MAFPonFilter49, RealignmentFilter, ABSOLUTE50, GATK MuTect251, PicardTools51,52, Variant Effect Predictor53, and Oncotator54.

Usually the variant allele frequency (VAF) cutoff to call mutations is set at 0.02, below which it would be difficult to distinguish somatic mutations from contamination by other samples and sequencing artifacts. Thus, we aimed at estimating contamination to assure that the mutations are biologically relevant even when going below 0.02. We present a framework that allows one to include smaller clones in CHIP calling as a function of the single-sample contamination, providing greater resolution into the initial claim that CHIP is the presence of characteristic driver gene mutations in hematopoietic cells that occurs at a VAF of at least 0.02. This distinction of clonal mutations from contamination becomes especially important for those samples whose sample-to-sample contamination is in fact greater than 2% such that a mutation with a VAF of 0.02 would not be considered in our analysis if the sample’s contamination was about 3% (the range of contaminations in our samples was from roughly 1–8%). However, given our high depth of coverage (978×) we were able to confidently call mutations, distinct from sample-to-sample contamination and sequencing artifacts.
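The idea of a per-sample, contamination-aware VAF threshold can be illustrated with a minimal sketch. The rule shown here (retain a variant only if its VAF is at least a fixed floor and also exceeds the sample's estimated contamination fraction) is an assumption made for illustration and is not necessarily the exact rule applied in the pipeline; the function name and the default floor parameter are hypothetical.

```python
def passes_contamination_floor(vaf, contamination, min_vaf=0.01):
    """Illustrative per-sample VAF check (assumed rule, not the published pipeline).

    vaf: variant allele frequency of the candidate mutation (0-1).
    contamination: estimated sample-to-sample contamination fraction (0-1).
    min_vaf: fixed lower floor on VAF (hypothetical default).
    """
    # Assumption: a variant must clear the fixed floor AND sit above the
    # contamination estimate to be distinguishable from contaminating reads.
    return vaf >= min_vaf and vaf > contamination


# A VAF of 0.02 in a sample with ~3% contamination is not retained,
# whereas a VAF of 0.015 in a sample with 1% contamination is.
print(passes_contamination_floor(0.02, 0.03))
print(passes_contamination_floor(0.015, 0.01))
```

Under this assumed rule, the threshold tightens automatically for more contaminated samples while allowing clones below a VAF of 0.02 to be considered in clean samples.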

To estimate contamination of single samples, we ran VerifyBamID55 with the ExAC56 VCF to test for germline SNPs with a minimum allele fraction of 0.25. In order to control for noise/artifacts with indel calling, we selected the youngest sample with no detectable CHIP mutations to use as an unmatched control for Strelka and MuTect2. Variants were classified as pathogenic driver mutations based on mutation type, position, and frequency in published reports8,19,57 and public databases58. The set of rules to consider a queried mutation as a “driver” mutation is outlined in Supplementary Table 2. The minimum number of alternate reads required to call a variant according to MuTect2 was not a hard cutoff but rather one determined as a function of the read depth, strand biases and contamination at a given site46. In practice, the lowest number of accepted alternate reads was 4. Variants with allele fraction less than 0.01 were excluded and, except for DNMT3A, TET2, ASXL1, PPM1D, TP53, JAK2, SF3B1 and SRSF2, mutations with VAF above 0.35 were also excluded since these often represent germline polymorphisms. To further confirm that our called mutations are somatic drivers, we excluded single nucleotide variant (SNV) mutations with a TLOD score below 6.3 (via MuTect2) and insertion-deletion (indel) mutations with a QSI_NT score below 30 (via Strelka). As a consequence of bait preferences, some germline mutations can sometimes have putatively somatic allele fractions if they are in regions of poor mapping. However, because Strelka and MuTect2 perform local realignment, these callers do not have deflated allele fractions, in contrast to other callers that double count reference reads if they detect split reads, which is a consequence of structural variation. Finally, all variants were visually inspected in Integrative Genomics Viewer (IGV)59.
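The numeric cut-offs described above can be summarized in a short sketch. It covers only the hard thresholds stated in the text (VAF bounds, TLOD and QSI_NT); the driver-classification rules, the site-specific alternate-read requirement and the manual IGV review are not modeled. The function name and argument layout are hypothetical.

```python
# Genes exempt from the upper VAF cut-off, as listed in the text.
EXEMPT_GENES = {"DNMT3A", "TET2", "ASXL1", "PPM1D", "TP53", "JAK2", "SF3B1", "SRSF2"}


def passes_hard_filters(gene, vaf, is_indel, score):
    """Apply the stated numeric cut-offs to one candidate variant.

    gene: gene symbol of the variant.
    vaf: variant allele fraction (0-1).
    is_indel: True for indels (scored by Strelka QSI_NT), False for SNVs
              (scored by MuTect2 TLOD).
    score: QSI_NT for indels, TLOD for SNVs.
    """
    # Variants with allele fraction below 0.01 are excluded.
    if vaf < 0.01:
        return False
    # High-VAF variants outside the exempt genes are treated as likely germline.
    if vaf > 0.35 and gene not in EXEMPT_GENES:
        return False
    # Caller-specific score thresholds.
    if is_indel:
        return score >= 30
    return score >= 6.3


print(passes_hard_filters("DNMT3A", 0.40, False, 10.0))  # exempt gene, kept
print(passes_hard_filters("KRAS", 0.40, False, 10.0))    # likely germline, dropped
```

Keeping the thresholds in one function makes it straightforward to see which rule rejects a given variant, although a production pipeline would typically record the failing filter rather than return a single boolean.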

We compared sequential samples taken from the same patient to determine whether a mutation called in one sample was also present in another. To do so, we performed force-calling, a technique that examines the reads in a BAM file at a list of genomic coordinates and counts the reads supporting an alternate allele at each location60.
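The principle of force-calling can be sketched with a toy example. This is not the tool used in the study: reads are represented as simple (start, sequence) pairs with ungapped alignments rather than BAM records, base and mapping qualities are ignored, and the function name is hypothetical.

```python
def force_call(reads, sites):
    """Count reads supporting the reference and alternate allele at given sites.

    reads: list of (start, sequence) pairs; start is the 0-based position of
           the first base, and alignments are assumed ungapped (toy model).
    sites: list of (position, ref_base, alt_base) tuples to interrogate.
    Returns {position: (ref_count, alt_count, vaf)}; vaf is None when no
    read carrying the ref or alt base covers the site.
    """
    results = {}
    for pos, ref, alt in sites:
        ref_count = 0
        alt_count = 0
        for start, seq in reads:
            offset = pos - start
            # Only reads overlapping the site contribute.
            if 0 <= offset < len(seq):
                base = seq[offset]
                if base == ref:
                    ref_count += 1
                elif base == alt:
                    alt_count += 1
        depth = ref_count + alt_count
        vaf = alt_count / depth if depth else None
        results[pos] = (ref_count, alt_count, vaf)
    return results


# Toy reads covering position 2: two carry the reference G, one carries T.
toy_reads = [(0, "ACGT"), (2, "GTAA"), (1, "CTTA")]
print(force_call(toy_reads, [(2, "G", "T")]))
```

Because the sites are supplied rather than discovered, this approach can report low-level support for a known mutation that would fall below a de novo calling threshold, which is how sub-threshold clones can be traced across sequential samples.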

Statistical analyses

OS was defined as the time from transplantation until death from any cause, with censoring at time last known to be alive. PFS was measured from ASCT to the date of disease progression or death from any cause, censoring at time last known to be alive and progression-free. Survival curves were estimated using the Kaplan-Meier method, with variance and CIs estimated using Greenwood’s formula. Stratified Cox regression was used for time-to-event outcomes; hazard ratios (HR) and 95% confidence intervals (CI) were reported. Stratification was based on age (age groups: 20–29, 30–39, 40–49, 50–59, 60–69, 70–79), ISS of the disease61 and number of lines received prior to ASCT. p-Values were two-sided, and those <0.05 were considered statistically significant. All data were analyzed using R version 3.5.0 (R Core Team).
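For readers unfamiliar with the estimator, the Kaplan-Meier method with Greenwood's variance can be sketched in a few lines. The analyses in this study were run in R; the Python function below is an independent illustration with a hypothetical name and toy data, not the code used for the reported results.

```python
import math


def kaplan_meier(times, events):
    """Kaplan-Meier estimate with Greenwood standard errors.

    times: follow-up time for each subject.
    events: 1 if the event was observed at that time, 0 if censored.
    Returns a list of (time, survival, standard_error) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    greenwood_sum = 0.0
    out = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d = 0        # events observed at time t
        leaving = 0  # subjects leaving the risk set at t (events + censored)
        while i < len(data) and data[i][0] == t:
            d += data[i][1]
            leaving += 1
            i += 1
        if d > 0:
            surv *= 1 - d / n_at_risk
            # Greenwood term; undefined when everyone at risk has the event,
            # in which case survival is 0 and the standard error is 0.
            if n_at_risk > d:
                greenwood_sum += d / (n_at_risk * (n_at_risk - d))
            out.append((t, surv, surv * math.sqrt(greenwood_sum)))
        n_at_risk -= leaving
    return out


# Toy data: five subjects, two of them censored.
for t, s, se in kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0]):
    print(t, round(s, 4), round(se, 4))
```

Censored subjects contribute to the risk set up to their censoring time but do not reduce the survival estimate, which is what distinguishes this estimator from a naive proportion of survivors.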

Death and occurrence of a TMN (namely, MDS and AML) were modeled as competing events. Wilcoxon rank-sum and Fisher’s exact tests were used for CHIP association with continuous and categorical variables, respectively. Ordinal variables with three or more groups were tested for association with CHIP using a Kruskal-Wallis test for singly-ordered contingency tables.
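Treating death and TMN as competing events means that the cumulative incidence of TMN is not simply one minus a Kaplan-Meier estimate. A minimal sketch of the standard nonparametric cumulative incidence estimator is shown below; the function name, cause coding and toy data are hypothetical, and Gray's test for comparing groups is not implemented.

```python
def cumulative_incidence(times, causes, cause_of_interest):
    """Nonparametric cumulative incidence function under competing risks.

    times: follow-up time for each subject.
    causes: 0 if censored, otherwise a positive integer event type
            (hypothetical coding, e.g. 1 = TMN, 2 = death without TMN).
    cause_of_interest: the event type whose incidence is estimated.
    Returns a list of (time, cumulative_incidence) at each time where the
    event of interest occurs.
    """
    data = sorted(zip(times, causes))
    n_at_risk = len(data)
    event_free = 1.0  # probability of being free of any event just before t
    cif = 0.0
    out = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d_interest = 0
        d_any = 0
        leaving = 0
        while i < len(data) and data[i][0] == t:
            cause = data[i][1]
            if cause != 0:
                d_any += 1
                if cause == cause_of_interest:
                    d_interest += 1
            leaving += 1
            i += 1
        if d_interest:
            # Increment: P(event-free until t) x hazard of the cause at t.
            cif += event_free * d_interest / n_at_risk
            out.append((t, cif))
        # Any event type removes subjects from the event-free state.
        event_free *= 1 - d_any / n_at_risk
        n_at_risk -= leaving
    return out


# Toy data: cause 1 at t=1 and t=4, competing cause 2 at t=2, censoring at t=3.
print(cumulative_incidence([1, 2, 3, 4], [1, 2, 0, 1], 1))
```

Subjects who experience the competing event are removed from the event-free pool rather than treated as censored, so the incidence curves for all causes sum to at most one.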

Reporting Summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.