Abstract
Objectives
Response assessment to neoadjuvant systemic treatment (NAST) to guide individualized treatment in breast cancer is a clinical research priority. We aimed to develop an intelligent algorithm using multi-modal pretreatment ultrasound and tomosynthesis radiomics features in addition to clinical variables to predict pathologic complete response (pCR) prior to the initiation of therapy.
Methods
We used retrospective data on patients who underwent ultrasound and tomosynthesis before starting NAST. We developed a support vector machine algorithm using pretreatment ultrasound and tomosynthesis radiomics features in addition to patient and tumor variables to predict pCR status (ypT0 and ypN0). Findings were compared to the histopathologic evaluation of the surgical specimen. The main outcome measures were area under the curve (AUC) and false-negative rate (FNR).
Results
We included 720 patients, 504 in the development set and 216 in the validation set. Median age was 51.6 years and 33.6% (242 of 720) achieved pCR. The addition of radiomics features significantly improved the performance of the algorithm (AUC 0.72 to 0.81; p = 0.007). The FNR of the multi-modal radiomics and clinical algorithm was 6.7% (10 of 150 with missed residual cancer). Surface/volume ratio at tomosynthesis and peritumoral entropy characteristics at ultrasound were the most relevant radiomics features. Hormonal receptors and HER-2 status were the most important clinical predictors.
Conclusion
A multi-modal machine learning algorithm with pretreatment clinical, ultrasound, and tomosynthesis radiomics features may aid in predicting residual cancer after NAST. Pending prospective validation, this may facilitate individually tailored NAST regimens.
Clinical relevance statement
Multi-modal radiomics using pretreatment ultrasound and tomosynthesis showed significant improvement in assessing response to NAST compared to an algorithm using clinical variables only. Further prospective validation of our findings seems warranted to enable individualized predictions of NAST outcomes.
Key Points
• We proposed a multi-modal machine learning algorithm with pretreatment clinical, ultrasound, and tomosynthesis radiomics features to predict response to neoadjuvant breast cancer treatment.
• Compared with the clinical algorithm, the AUC of this integrative algorithm is significantly higher.
• Used prior to the initiation of therapy, our algorithm can identify patients who will experience pathologic complete response following neoadjuvant therapy with a high negative predictive value.
Introduction
Neoadjuvant systemic treatment (NAST) is the standard treatment for patients with early breast cancer because it allows response monitoring and tumor down-staging [1]. Patients who achieve a pathological complete response (pCR) have significantly better survival compared to non-pCR patients [2]. Understanding the likelihood that an individual will achieve pCR prior to the initiation of therapy could facilitate individually optimized NAST regimens.
The application of machine learning in medicine has developed rapidly in recent years [3]. Predicting tumor response to NAST in breast cancer has been explored in multiple radiomics studies [4,5,6]. Radiomics is a tool that can extract image features and present them numerically [7]. To date, radiomics-based algorithms have shown promising results in predicting breast tumor response but with certain limitations: (1) high performance is often seen for algorithms that use radiological examinations after or close to the completion of NAST, which limits the clinical application of the predictive algorithm; (2) most studies [8,9,10] investigated single-modality imaging radiomics, which does not represent the integrative multi-modality imaging process in clinical routine (mainly ultrasound and mammography/tomosynthesis); (3) although MRI-based radiomics models showed satisfactory results [9], MRI examinations are not routinely used for every patient due to contraindications and economic reasons [11]; (4) clearly reported, standardized image processing, which has a large effect on model performance and generalizability, is lacking; (5) tomosynthesis has recently shown better performance compared to mammography in screening women with extremely dense breasts and at high risk of breast cancer [12], but the performance of tomosynthesis-based radiomics algorithms in response assessment to NAST remains unknown.
In this study, we aimed to develop and compare intelligent algorithms using multi-modal pretreatment ultrasound and tomosynthesis radiomics features in addition to clinical variables to predict pCR in breast cancer prior to the initiation of therapy.
Methods
Study design
This retrospective, single-center study was conducted in accordance with the Declaration of Helsinki and was approved by the Ethics Committee of Heidelberg University Medical Faculty (S-092/2022).
In this study, we aimed to develop and compare intelligent algorithms using pretreatment ultrasound and tomosynthesis radiomics features in addition to clinical variables to predict response to NAST in breast cancer before the initiation of therapy. We considered three different types of input variables: clinical variables, ultrasound radiomics, and tomosynthesis radiomics. A full list and definition of clinical variables are detailed in Supplemental Table S2. Thus, we evaluated and compared different models based on their input variables:
- Only clinical variables.
- Clinical variables and one-view ultrasound radiomics with peritumor information.
- Clinical variables and double-view ultrasound radiomics with peritumor information.
- Clinical variables and tomosynthesis radiomics.
- Clinical variables and tomosynthesis radiomics with peritumor information.
- Integrative, multi-modality model using clinical, ultrasound, and tomosynthesis radiomics including peritumor information.
Patient selection
The inclusion criteria were as follows:
(1) Patients with pathologically proven diagnosis of breast cancer.
(2) Underwent neoadjuvant systemic treatment.
(3) Underwent mammography tomosynthesis and ultrasound examination before neoadjuvant systemic treatment at Heidelberg University Hospital.
(4) Without distant metastasis at the time of diagnosis.
(5) Any tumor biology.
The exclusion criteria were as follows:
(1) Combined with other tumor disease.
(2) Aged <18 years.
Patients’ ultrasound and tomosynthesis images were acquired by experienced physicians specialized in breast diagnostics using Siemens machines (for ultrasound Siemens S2000 and S3000, for tomosynthesis Novation and Inspiration). The clearest double view of ultrasound images (view with largest diameter and 90° view) was documented, and one slice of tomosynthesis image in mediolateral oblique (MLO) view and mediolateral (ML) view with largest tumor was selected and documented. All images were stored in Digital Imaging and Communications in Medicine (DICOM) format. The corresponding clinical variables were documented from patients’ medical records (Table 1).
Image processing
1) Histogram matching
We used histogram matching to maintain consistency of images acquired by different types of machines and settings [13], and one normal ultrasound image and one slice of normal tomosynthesis image were selected for histogram matching.
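As a rough illustration of the idea, histogram matching can be sketched as mapping each gray level of an image to the first reference gray level whose cumulative frequency reaches the same quantile. The sketch below is a minimal pure-Python version that operates on flattened pixel lists; the function names `cumulative_distribution` and `match_histogram` are hypothetical, and this is not the implementation used in the study.

```python
from bisect import bisect_left
from collections import Counter


def cumulative_distribution(pixels):
    """Return the sorted gray levels and their cumulative relative frequencies."""
    counts = Counter(pixels)
    levels = sorted(counts)
    cumulative, running = [], 0
    for level in levels:
        running += counts[level]
        cumulative.append(running / len(pixels))
    return levels, cumulative


def match_histogram(source, reference):
    """Remap source gray levels so their distribution follows the reference image."""
    src_levels, src_cdf = cumulative_distribution(source)
    ref_levels, ref_cdf = cumulative_distribution(reference)
    mapping = {}
    for level, quantile in zip(src_levels, src_cdf):
        # First reference level whose cumulative frequency reaches this quantile.
        index = min(bisect_left(ref_cdf, quantile), len(ref_levels) - 1)
        mapping[level] = ref_levels[index]
    return [mapping[p] for p in source]
```

In practice the same mapping would be applied to two-dimensional pixel arrays, with one fixed reference image per modality as described above.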
2) Segmentation
We used the open-source software 3D Slicer (4.13.0-2022-04-01) for segmentation. The outline of tumors was delineated semi-automatically, and the 3-mm peritumor spaces were generated using the “Hollow” effect in 3D Slicer [14]. Figure 1 shows examples of segmentation in an ultrasound image and a tomosynthesis image.
3) Re-segmentation, discretization, and feature extraction
Re-segmentation and discretization were performed as part of feature extraction. Re-segmentation removes pixels from the segmented region that fall outside a specified range of gray levels, which reduces errors caused by manual delineation [15]. Discretization is conceptually equivalent to the creation of a histogram to make feature calculation tractable [16]. In practice, both are specified as parameters of the feature extraction. We used the most common parameter μ±3σ for re-segmentation [15, 17] (μ stands for the mean value of gray levels and σ stands for the standard deviation). The optimal bin width for image discretization is still unclear [18]; we set the bin width to 10. We used the open-source software PyRadiomics for feature extraction [19].
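The two steps can be illustrated with a short sketch. It assumes the pixels of one segmented region are given as a flat list, applies the μ±3σ range described above, and then bins the remaining gray levels with a fixed width of 10 using one common fixed-bin-width convention (bin index = floor(value / width) − floor(minimum / width) + 1). The function names are hypothetical and the code is a simplified illustration, not the feature-extraction software itself.

```python
import math
from statistics import mean, pstdev


def resegment(pixels, n_sigma=3.0):
    """Keep only pixels whose gray level lies within mean +/- n_sigma * SD."""
    mu, sigma = mean(pixels), pstdev(pixels)
    lower, upper = mu - n_sigma * sigma, mu + n_sigma * sigma
    return [p for p in pixels if lower <= p <= upper]


def discretize(pixels, bin_width=10):
    """Assign each gray level to a fixed-width bin, numbered from 1."""
    offset = math.floor(min(pixels) / bin_width)
    return [math.floor(p / bin_width) - offset + 1 for p in pixels]


# Toy region: twenty ordinary pixels and one extreme outlier.
region = [100] * 20 + [1000]
cleaned = resegment(region)      # the outlier at 1000 is removed
binned = discretize([0, 9, 10, 25, 39])
```

A fixed bin width keeps the meaning of one bin constant across images, whereas a fixed bin count would rescale each image to its own intensity range.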
4) Feature selection
We used Pearson’s correlation coefficient matrix (PCCM) and recursive feature elimination (RFE) embedded within the 10-fold cross-validation on the development set for feature selection. First, PCCM was applied to identify multicollinearity between features; only one feature was preserved of any pair with a correlation coefficient of more than 0.85 or less than −0.85 [20]. Second, RFE was applied to further reduce the number of radiomics features on the development set [21, 22].
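The correlation-based filtering step can be sketched as follows. The example assumes features are supplied as a dictionary of equally long, non-constant value lists and keeps the first feature of any pair whose absolute Pearson correlation exceeds 0.85; the order in which features are visited is an assumption, since the source does not state which member of a pair was kept. The RFE step and the embedding in cross-validation are not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long, non-constant lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def drop_correlated(features, threshold=0.85):
    """Return names of features kept after removing one of each correlated pair."""
    kept = []
    for name, values in features.items():
        # Keep a feature only if it is not strongly correlated with any kept one.
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


toy = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10.5],     # almost a multiple of "a"
    "c": [5, 1, 4, 2, 3],        # weakly correlated with "a"
    "d": [-1, -2, -3, -4, -5],   # perfectly anti-correlated with "a"
}
selected = drop_correlated(toy)
```

To avoid information leakage, such a filter would be fitted on the training folds only, in line with the cross-validation embedding described above.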
Outcome and definitions
Pathological evaluation of the surgical specimen served as gold standard for the definition of pCR. We assumed pCR if no residual invasive or in situ tumor cells were found in the breast and axillary lymph nodes (ypT0 and ypN0). Details are shown in Table 1.
Model construction and evaluation
For the algorithm development and reporting, we considered guidelines on machine learning in medicine [23], diagnostic tests [24], and multivariable prediction models [25]. A checklist informed by recent guidelines on machine learning in medicine is provided in the Data Supplement.
We chose a support vector machine (SVM) algorithm for model construction due to its known characteristic of considering non-linear inter-feature relationships [26]. We randomly split the whole cohort into a development set (504 of 720, 70.0%) and a validation set (216 of 720, 30.0%).
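A random 70/30 split of this kind can be sketched in a few lines. The seed, the use of integer patient identifiers, and the function name `split_cohort` are assumptions made for illustration; the source does not report how the randomization was implemented.

```python
import random


def split_cohort(patient_ids, dev_fraction=0.70, seed=42):
    """Shuffle patient IDs and split them into development and validation sets."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    n_dev = round(len(ids) * dev_fraction)
    return ids[:n_dev], ids[n_dev:]


# 720 hypothetical patient identifiers give 504 development and 216 validation cases.
dev_ids, val_ids = split_cohort(range(720))
```

Splitting at the patient level, rather than at the image level, keeps all images of one patient in the same set.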
Ten-fold cross-validation was used for algorithm training and internal testing on the development dataset. A grid search was performed to select the optimal hyperparameters. False-negative rate (FNR) was considered the main measurement of model performance. The risk threshold for the binary outcome prediction was chosen at 90% sensitivity in the development set by maximizing the metric over 1000 bootstrap replicates. The final integrative multi-modal model with an optimized threshold was then validated using the validation set. Figure 2 illustrates the cutoff chosen for the final integrative multi-modal model in the development set.
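The threshold choice can be illustrated with a simplified sketch that, for a set of predicted risks, returns the highest cutoff at which sensitivity still reaches the target. It assumes that label 1 denotes residual cancer (non-pCR) and that a score at or above the cutoff counts as test-positive; the bootstrap replication described above is omitted, and the function names are hypothetical.

```python
def sensitivity_at(threshold, scores, labels):
    """Share of true positives (label 1) with a score at or above the threshold."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    return sum(s >= threshold for s in positives) / len(positives)


def threshold_for_sensitivity(scores, labels, target=0.90):
    """Highest observed score that still achieves the target sensitivity."""
    candidates = sorted(set(scores), reverse=True)
    for t in candidates:
        # Sensitivity can only increase as the threshold is lowered.
        if sensitivity_at(t, scores, labels) >= target:
            return t
    return candidates[-1]


# Toy data: ten patients with residual cancer and three with pCR.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.45, 0.4, 0.35, 0.1, 0.3, 0.2, 0.05]
toy_labels = [1] * 10 + [0] * 3
cutoff = threshold_for_sensitivity(toy_scores, toy_labels)
```

Choosing the highest cutoff that meets the sensitivity target keeps the false-negative rate bounded while preserving as much specificity as possible.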
We calculated additional diagnostic metrics such as area under the receiver operating characteristic curve (AUC), accuracy, specificity, sensitivity, false-positive rate (FPR), positive predictive value (PPV), and negative predictive value (NPV). We used the “DALEX” package in R to calculate the model-agnostic variable importance measure computed via permutation (e.g., computing the loss function for the full model and then the loss function with a randomly permuted variable). We used decision curve analysis (DCA) to better illustrate the benefits of clinical application of the models [27]. Python (Version 3.9.7) and R (Version 4.2.1) were used for all analyses.
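The principle behind permutation-based variable importance can be sketched independently of any package: the loss of the fitted model is computed once on the original data and again after shuffling one input column, and the average increase in loss is taken as that variable's importance. The sketch below uses a hypothetical toy model and mean squared error as the loss; it illustrates the general technique and is not the DALEX implementation.

```python
import random


def mean_squared_error(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, loss, n_repeats=10, seed=0):
    """Average increase in loss after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = loss(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(loss(y, [predict(r) for r in X_perm]) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Toy model that uses only the first feature and ignores the second.
X = [[float(i), float((i * 7) % 5)] for i in range(20)]
y = [row[0] for row in X]
importances = permutation_importance(lambda row: row[0], X, y, mean_squared_error)
```

A variable that the model ignores receives an importance of zero, because shuffling it leaves every prediction unchanged.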
Statistical analysis
We performed a descriptive analysis to illustrate the distribution of the baseline characteristics of the development and validation sets. We used the chi-square test for categorical data, and the t-test for continuous data to compare differences in baseline characteristics between the development and validation set. We calculated area under the receiver operating characteristic curve and accompanying 95% CIs for the algorithms using 2000 bootstrap replicates stratified for the outcome variable (non-pCR, ypT+, and/or ypN+). The Venkatraman method tests were used to compare models’ performance [28]. A proportion test was used to compare the model’s diagnostic performance [29]. Calibration plots (observed vs. predicted probabilities) and Spiegelhalter’s Z statistics were used to evaluate model calibration [30, 31].
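A stratified bootstrap confidence interval for the AUC can be sketched as follows. The example assumes predicted scores are already separated by outcome, resamples each outcome group with replacement so that the class balance is preserved, and reports simple percentile limits; the percentile method and the function names are assumptions, as the source does not specify the interval type.

```python
import random


def auc(pos_scores, neg_scores):
    """AUC as the probability that a positive case scores higher than a negative one."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))


def stratified_bootstrap_ci(pos_scores, neg_scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile confidence interval for the AUC, resampling within each outcome group."""
    rng = random.Random(seed)
    stats = sorted(
        auc(rng.choices(pos_scores, k=len(pos_scores)),
            rng.choices(neg_scores, k=len(neg_scores)))
        for _ in range(n_boot)
    )
    lower_index = int(n_boot * alpha / 2)
    return stats[lower_index], stats[n_boot - 1 - lower_index]


# Toy scores for four cases with and four without the outcome.
toy_auc = auc([0.9, 0.4], [0.5, 0.1])
ci_low, ci_high = stratified_bootstrap_ci([0.9, 0.7, 0.6, 0.4], [0.5, 0.3, 0.2, 0.1], n_boot=200)
```

Resampling within each outcome group guarantees that every replicate contains both classes, so the AUC is always defined.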
p values < 0.05 were considered significant.
Results
Patient flow
Of 1643 patients who underwent neoadjuvant systemic treatment from 2010 to 2020 at Heidelberg University Hospital, 75 were excluded because of distant metastasis, 768 were excluded because they did not undergo pretreatment ultrasound and/or tomosynthesis examinations at our institution, and 80 were lost due to technical issues (not transferable into image analysis software or double-view ultrasound images saved side-by-side within one image instead of two separate images). The remaining 720 patients were analyzed in this study (Fig. 3).
Baseline characteristics
Of 720 patients, 33.6% (242 of 720) achieved pCR. Comparing the development and validation sets, more patients in the development set had ER-positive tumors (60.1% vs. 52.1%, p = 0.046). Details regarding baseline clinical characteristics are shown in Table 1. pCR rates according to breast cancer subtype are displayed in Table 2. The HER-2 over-expression subtype achieved the highest pCR rate in the whole cohort (49 of 79, 62.0%) and the development set (35 of 51, 68.6%).
Feature selection
Per segmentation, 130 features were extracted, resulting in a total of 780 features for one patient with double-view ultrasound and tomosynthesis, with tumor as well as peri-tumor segmentation. After removing non-numeric features and applying PCCM, 22 ultrasound radiomics features and 33 tomosynthesis radiomics features were preserved. Finally, 23 features were selected by RFE, detailed in Table S3. The final model features are provided in Table S4.
Model performance
Figure 4 shows the comparison between the different SVM models: the clinical model, one-view ultrasound model, two-view ultrasound model, tomosynthesis tumor radiomics model, tomosynthesis tumor plus peritumor radiomics model, and the integrative model with multi-modal clinical, ultrasound, and tomosynthesis radiomics features. The multi-modal model and the model with tomosynthesis tumor plus peritumor radiomics features had significantly higher performance in predicting tumor response to NAST compared to the clinical model (AUC 0.81, 95% CI 0.75–0.87 and AUC 0.79, 95% CI 0.72–0.85, respectively, vs. 0.72, 95% CI: 0.65–0.78; p = 0.007 and p = 0.03). The rest of the models’ AUC values were improved without statistical significance (Table S4). When ypT0/is, ypN0 was used as endpoint definition, the integrative multi-modal model performance was AUC 0.78 (95% CI 0.71–0.85; see Table S6).
With an eye to reliably excluding residual cancer after NAST, the multi-modal model revealed a significantly lower FNR of 6.7% (10 of 150 patients with missed residual cancer), compared to the clinical model (14.0%, 21 of 150, p = 0.016). Table 3 shows the diagnostic performance metrics of the clinical model and multi-modal model.
Table 4 shows the matrix of the clinical model and multi-modal model as well as AUC values by tumor biologic subtype. The luminal subtype achieved the highest AUC of 0.83 (95% CI: 0.75–0.91) and the TNBC subtype achieved the lowest AUC (0.71, 95% CI: 0.57–0.83).
Insights into model predictions
Table 5 illustrates the clinical univariable and multivariable logistic regression results of non-pCR versus pCR. Upon performing multivariable logistic regression, Ki-67 (odds ratio [OR] 0.99; 95% CI, 0.98 to 1.00, p = 0.003), perimenopause status (OR 0.54; 95% CI, 0.31 to 0.95, p = 0.032), positive estrogen receptor (ER) status (OR 1.82, 95% CI, 1.15 to 2.89, p = 0.011), positive progesterone receptor (PR) status (OR 2.14, 95% CI, 1.35 to 3.40, p = 0.001), and positive HER-2 status (OR 0.32, 95% CI, 0.22 to 0.47, p < 0.001) were significantly associated with non-pCR after NAST.
Figure 5 illustrates insights into the variable importance for the predictions made by the multi-modal SVM model. The five most important variables were tomosynthesis tumor original shape surface volume ratio, ER status, HER-2 status, ultrasound tumor original gray level size zone matrix (GLSZM) zone entropy, and PR status.
Figure 6 shows the decision curve analysis of the integrative multi-modal model and the clinical model. Net benefits of the two models and of the default approaches of treating all patients (always act) or treating no patients (never act) are shown. From 0.29 to 1.0 threshold probabilities, the integrative multi-modal model has the highest net benefit.
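The quantity plotted in a decision curve is the net benefit, which weighs true positives against false positives at a given threshold probability pt: net benefit = TP/n − (FP/n) × pt/(1 − pt). The sketch below evaluates this formula for a single fixed operating point. The counts (140 true positives and 41 false positives among 216 validation patients, of whom 150 had residual cancer) are inferred from figures quoted in the Discussion and are an assumption of this example; a full decision curve re-derives the counts from the predicted probabilities at every threshold.

```python
def net_benefit(true_positives, false_positives, n, threshold_probability):
    """Net benefit of acting on test-positive patients at a given threshold probability."""
    odds = threshold_probability / (1 - threshold_probability)
    return true_positives / n - (false_positives / n) * odds


N_PATIENTS = 216        # validation set size
N_RESIDUAL = 150        # patients with residual cancer (non-pCR)

# Fixed operating point of the multi-modal model (inferred counts, see lead-in).
model_nb = net_benefit(140, 41, N_PATIENTS, 0.5)

# "Always act": every patient is treated as test-positive.
treat_all_nb = net_benefit(N_RESIDUAL, N_PATIENTS - N_RESIDUAL, N_PATIENTS, 0.5)

# "Never act": no true or false positives, so the net benefit is zero.
treat_none_nb = net_benefit(0, 0, N_PATIENTS, 0.5)
```

At low threshold probabilities false positives are penalized only lightly, which is why the always-act strategy can outperform a model in that range.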
Model calibration
Figure S2 illustrates the calibration plot of the multi-modal SVM model; Spiegelhalter’s z indicates a well calibrated model (z = 0.2301, p = 0.409).
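Spiegelhalter’s z compares observed outcomes y_i with predicted probabilities p_i through z = Σ(y_i − p_i)(1 − 2p_i) / sqrt(Σ(1 − 2p_i)² p_i(1 − p_i)), which is approximately standard normal for a well calibrated model. The sketch below is a minimal illustration with hypothetical function names; it also shows that the upper-tail normal probability for z = 0.2301 is about 0.409, which agrees with the p value reported above. That the reported p value is one-sided is an inference from this agreement, not a statement in the source.

```python
import math


def spiegelhalter_z(y_true, y_prob):
    """Spiegelhalter's z statistic for binary outcomes and predicted probabilities."""
    numerator = sum((y - p) * (1 - 2 * p) for y, p in zip(y_true, y_prob))
    variance = sum((1 - 2 * p) ** 2 * p * (1 - p) for p in y_prob)
    return numerator / math.sqrt(variance)


def upper_tail_p(z):
    """Probability that a standard normal variable exceeds z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


toy_z = spiegelhalter_z([1, 0], [0.8, 0.2])
p_reported = upper_tail_p(0.2301)
```

Values of z close to zero, with correspondingly large p values, give no evidence of miscalibration.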
Discussion
We developed and compared intelligent algorithms using multi-modal pretreatment ultrasound and tomosynthesis radiomics features in addition to clinical variables to predict response to NAST in breast cancer prior to treatment initiation. The integrative, multi-modal algorithm showed significant improvement in assessing response to NAST compared to an algorithm using clinical variables only (AUC 0.81, 95% CI 0.75–0.87 vs. AUC 0.72, 95% CI 0.65–0.78, p = 0.007) with an FNR of 6.7% (10 of 150 patients with missed residual cancer in the surgical specimen, ypT+ or ypN+). To our knowledge, this is the first study to use multi-modal radiomics features from different examinations to create predictions prior to treatment. Our study strictly follows the Image Biomarker Standardization Initiative (IBSI) guideline [15], and presents transparent parameters of image processing (i.e., histogram matching, image re-segmentation, and discretization).
Individualized treatment for breast cancer patients undergoing NAST has been a research priority over the past decade. Although up to 60% of patients achieve pCR (depending on tumor size and biology) [32], every patient currently has to undergo surgery due to the lack of tools to reliably exclude residual cancer prior to surgery. A recent single-center study reported the first oncologic outcomes for the omission of breast surgery using a vacuum assisted biopsy (VAB) performed after NAST in patients with strict inclusion criteria (cT1-2, cN0-1, triple-negative or HER-2 positive, residual lesion < 2 cm on imaging after NAST): There was no ipsilateral recurrence at a follow-up of 26.4 months [33]. However, the use of VAB previously showed high FNR in a multi-center setting [34]. Recently, a multicenter, intelligent VAB algorithm showed a FNR of 0.0–5.2% [35]. Our present study showed comparable results (FNR: 6.7%, 10 of 150) without the use of an additional biopsy procedure and with only pretreatment information.
Expanding on this clinical background, potential new pathways for the addressed patients following NAST are imaginable: The real-world scenario currently directs all patients to surgery following NAST and accepts high rates of overtreatment (surgery) for histologically negative patients, but avoids undertreatment. Using the integrative multi-modal model, all test-positive patients might be directed to surgery, resulting in overtreatment of false-positive patients (41/181; 22.6%), which is, however, still lower compared to the current practice (100% undergo surgery). All test-negative patients might be directed to extended minimally invasive biopsy. Undertreatment of false-negative patients (10/35; 28.5%) must be avoided and might be prevented by extended imaging-guided vacuum-assisted biopsy of the tumor bed or radiation therapy and omitting surgery. Patients with positive biopsy results would need to be directed to surgery. Finally, 11.6% (25/216) would benefit from this de-escalating concept; this proportion is in line with recent paradigm shifts in locoregional breast cancer management [36]. It should be noted, however, that the NPV of 71.6% means that 28.4% of patients who have been told a negative (tumor-free) test result might skip surgery although there is actually residual cancer left. Notably, past surgical de-escalation strategies in breast cancer were based on the FNR, as the FNR is independent from the prevalence in the respective population.
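The proportions in this scenario follow from a single 2×2 table, which can be made explicit with a short sketch. The four cell counts are inferred from the figures quoted above (41 false positives among 181 test-positive patients, 10 false negatives among 35 test-negative patients) and are an assumption of this example rather than values taken from Table 3; residual cancer (non-pCR) is treated as the positive class. With these inferred counts the NPV evaluates to 25/35, about 71.4%, which is close to but not identical with the 71.6% quoted above.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard metrics of a 2x2 table; positive class = residual cancer (non-pCR)."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "fnr": fn / (fn + tp),
        "fpr": fp / (fp + tn),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Inferred validation-set counts: 181 test-positive (140 TP, 41 FP),
# 35 test-negative (10 FN, 25 TN), 216 patients in total.
metrics = diagnostic_metrics(tp=140, fp=41, fn=10, tn=25)
share_spared_surgery = 25 / 216   # true negatives as a share of all patients
```

The FNR depends only on patients with residual cancer (10 of 150), whereas the NPV also depends on how many patients achieve pCR, which is why the two measures can give different impressions of the same model.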
Many studies have tried to build radiomics models to predict tumor response to NAST, but their performances and qualities vary [37]. In terms of performance, features extracted from examinations at multiple time points performed better, with AUCs ranging from 0.86 [38] to 0.94 [5], but require patients to undergo several examinations (i.e., pretreatment, early treatment after completion of two [38] and/or four cycles [8] of NAST, and post-treatment). This requires a high degree of patient compliance and consumes a great deal of effort by physicians in clinical application. In terms of quality, some studies extracted features from a single examination time point but did not report the specific time [39,40,41]. Other studies developed models only with pretreatment radiomics features, with performances ranging from 0.79 [42] to 0.92 [43]. However, sample sizes remained limited [25] (development set up to 362 patients [43]).
The peritumor space is considered to be highly related to the tumor microenvironment and plays an important role in the process of tumor angiogenesis and proliferation [44]. Radiomics studies based on MRI [9], ultrasound [10], and mammography [39] demonstrated that peritumor space can provide complementary information for predicting tumor response. But the optimal width of peritumor space remains controversial, with some studies suggesting that wider peritumor space (10 mm) performed worse than narrower space (5 mm) [39]. Few studies investigated the efficacy of peritumor space on tomosynthesis. In our study, we extracted features from 3 mm peritumor space, and the performance of the tomosynthesis tumor plus peritumor model improved but without statistical significance compared to the tomosynthesis tumor-only model.
There is an ongoing discussion about whether radiomics or deep learning analyses should be preferred for the analysis of medical images. Deep learning analyses often show higher performance and require less human work during the image processing; however, they lack interpretability. Radiomics, on the other hand, requires time-consuming, (semi-)automatic image segmentation, but allows for some interpretability of the model [4, 5, 8]. In our study, two radiomics features ranked in the top five among all variables. The first was the original surface volume ratio (SA:V) of the tumor in tomosynthesis: the higher the SA:V, the more likely the patient is to have residual cancer after NAST. This may indicate that patients with more compact (sphere-like) shaped tumors on tomosynthesis might have higher chances of reaching pCR (e.g., triple-negative tumors) [45], while patients with irregular-shaped, crab-like, and polygonal tumors have lower chances of reaching pCR (e.g., luminal tumors). The second was the original GLSZM zone entropy of the tumor in ultrasound images. GLSZM zone entropy measures the uncertainty/randomness in the distribution of zone sizes and gray levels. A higher value indicates more heterogeneity in the texture patterns [19]. This may indicate that breast tumors with heterogeneous echo on ultrasound images have lower chances of reaching pCR.
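Both features have a simple geometric or information-theoretic reading, which the following sketch illustrates on idealized inputs. The surface-to-volume ratio is computed for a sphere and for rectangular boxes of equal volume, and the zone entropy is computed as the Shannon entropy (in bits) of a small, hypothetical table of zone counts indexed by gray level and zone size. Both functions are simplified stand-ins for illustration, not the radiomics software definitions, which operate on segmented images.

```python
import math


def sphere_surface_to_volume(volume):
    """Surface-to-volume ratio of a sphere with the given volume (equals 3 / radius)."""
    radius = (3 * volume / (4 * math.pi)) ** (1 / 3)
    return 3 / radius


def box_surface_to_volume(a, b, c):
    """Surface-to-volume ratio of a rectangular box with side lengths a, b, c."""
    return 2 * (a * b + b * c + c * a) / (a * b * c)


def zone_entropy(zone_counts):
    """Shannon entropy in bits of a table of zone counts (rows: gray level, columns: size)."""
    total = sum(sum(row) for row in zone_counts)
    probabilities = [c / total for row in zone_counts for c in row if c > 0]
    return -sum(p * math.log2(p) for p in probabilities)


# Three shapes with the same volume of 1000 units.
compact = sphere_surface_to_volume(1000)
cube = box_surface_to_volume(10, 10, 10)
slab = box_surface_to_volume(1, 10, 100)

# A texture with one dominant zone type versus an evenly mixed one.
homogeneous = zone_entropy([[4, 0], [0, 0]])
heterogeneous = zone_entropy([[1, 1], [1, 1]])
```

For equal volume the sphere has the smallest ratio and the flat slab the largest, mirroring the contrast between compact and irregular tumors described above; likewise, the evenly mixed zone table yields the higher entropy.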
This study has limitations. First, this is a retrospective, single-center study. Potential selection bias might have affected our findings, as a relevant number of patients who did not undergo imaging at our institution were excluded. Another source of bias arises from the ethnically homogeneous cohort, since, e.g., Asian women tend to have denser breasts [46], which might have a negative influence on the model’s generalizability [47, 48]. Second, our findings will have to be replicated on images taken on different ultrasound and tomosynthesis machines to ensure generalizability of the algorithms. A prospective, multicenter study is required to further validate our findings. Third, tomosynthesis allows for digital reconstruction in 2 planes but not for 3D reconstruction. Thus, a single slice of tomosynthesis planes was analyzed in this study. Future research may look into automatically analyzing video clips of tomosynthesis to capture the full potential of tomosynthesis. Fourth, our analysis spans a large timeframe from 2010 to 2020; patients underwent a variety of NAST regimens, and the standard of care has changed during this time. As our study focuses on pretreatment ultrasound images, we do not expect that the response to different NAST regimens on imaging influences our models, but we acknowledge that response to NAST has much improved with the use of modern NAST regimens [32, 49]. Thus, the issue of input and output parameters changing over time might be a point of attention for further research, also when implementing such models in clinical practice in the future. Fifth, different definitions for pCR exist (ypT0, ypN0 vs. ypT0/is, ypN0). While most guidelines allow residual in situ disease to be considered a complete response, our present study was performed with an eye to potentially excluding residual cancer early to reduce surgical management. Thus, in situ disease must also be excluded (indication for surgical resection), which is why we chose this endpoint, in line with previous research on this topic [35]. We provided a comparison of the integrative multi-modal model’s performance on different definitions of pCR (ypT0, ypN0 vs. ypT0/is, ypN0) in Table S6. Sixth, ultrasound presents an inherent inter-rater variability, which also applies to radiomics-based ultrasound analysis. Thus, future studies are needed to confirm reproducibility of the features. In order to minimize feature bias during the radiomics analysis, we used fixed bin widths for image discretization and outlier removal techniques for re-segmentation, which complies with recent guidelines and other research in that area [15, 50].
Conclusion
We developed and compared intelligent algorithms using multi-modal pretreatment ultrasound and tomosynthesis radiomics features in addition to clinical variables to predict response to NAST in breast cancer prior to treatment initiation. The integrative, multi-modal algorithm showed significant improvement in assessing response to NAST compared to an algorithm using clinical variables only (AUC 0.81, 95% CI 0.75–0.87 vs. AUC 0.72, 95% CI 0.65–0.78, p = 0.007) with an FNR of 6.7% in the validation set (10 of 150 patients with missed residual cancer in the surgical specimen, ypT+ or ypN+). The FNR of the multi-modal pretreatment ultrasound and tomosynthesis radiomics model was in the range of previous, more invasive efforts to reliably exclude residual cancer after NAST using minimally invasive biopsies. Further prospective validation of our findings seems warranted to confirm our results and enable individualized predictions of NAST outcomes prior to treatment initiation.
Abbreviations
- AUC: Area under the curve
- CI: Confidence interval
- DCA: Decision curve analysis
- DICOM: Digital imaging and communications in medicine
- ER: Estrogen receptor
- FNR: False-negative rate
- FPR: False-positive rate
- GLSZM: Gray level size zone matrix
- HER-2: Human epidermal growth factor receptor 2
- ML: Mediolateral
- MLO: Mediolateral oblique
- NAST: Neoadjuvant systemic treatment
- NPV: Negative predictive value
- PCCM: Pearson correlation coefficient matrix
- pCR: Pathologic complete response
- PPV: Positive predictive value
- PR: Progesterone receptor
- RFE: Recursive feature elimination
- SVM: Support vector machine
References
Korde LA, Somerfield MR, Carey LA et al (2021) Neoadjuvant chemotherapy, endocrine therapy, and targeted therapy for breast cancer: ASCO guideline. J Clin Oncol 39:1485–1505. https://doi.org/10.1200/JCO.20.03399
Conforti F, Pala L, Bagnardi V et al (2022) Surrogacy of pathologic complete response in trials of neoadjuvant therapy for early breast cancer: critical analysis of strengths, weaknesses, and misinterpretations. JAMA Oncol. https://doi.org/10.1001/jamaoncol.2022.3755
Rajkomar A, Dean J, Kohane I (2019) Machine learning in medicine. N Engl J Med 380:1347–1358. https://doi.org/10.1056/NEJMra1814259
Li F, Yang Y, Wei Y et al (2021) Deep learning-based predictive biomarker of pathological complete response to neoadjuvant chemotherapy from histological images in breast cancer. J Transl Med 19:348. https://doi.org/10.1186/s12967-021-03020-z
Jiang M, Li C-L, Luo X-M et al (2021) Ultrasound-based deep learning radiomics in the assessment of pathological complete response to neoadjuvant chemotherapy in locally advanced breast cancer. Eur J Cancer 147:95–105. https://doi.org/10.1016/j.ejca.2021.01.028
Peng S, Chen L, Tao J et al (2021) Radiomics analysis of multi-phase DCE-MRI in predicting tumor response to neoadjuvant therapy in breast cancer. Diagnostics (Basel) 11:2086. https://doi.org/10.3390/diagnostics11112086
Bitencourt A, Daimiel Naranjo I, Lo Gullo R, Rossi Saccarelli C, Pinker K (2021) AI-enhanced breast imaging: where are we and where are we heading? Eur J Radiol 142:109882. https://doi.org/10.1016/j.ejrad.2021.109882
Gu J, Tong T, He C et al (2022) Deep learning radiomics of ultrasonography can predict response to neoadjuvant chemotherapy in breast cancer at an early stage of treatment: a prospective study. Eur Radiol 32:2099–2109. https://doi.org/10.1007/s00330-021-08293-y
Hussain L, Huang P, Nguyen T et al (2021) Machine learning classification of texture features of MRI breast tumor and peri-tumor of combined pre- and early treatment predicts pathologic complete response. Biomed Eng Online 20:63. https://doi.org/10.1186/s12938-021-00899-z
Yu F, Hang J, Deng J et al (2021) Radiomics features on ultrasound imaging for the prediction of disease-free survival in triple negative breast cancer: a multi-institutional study. Br J Radiol 94:20210188. https://doi.org/10.1259/bjr.20210188
Ghadimi M, Sapra A (2021) Magnetic resonance imaging contraindications. In: StatPearls. StatPearls Publishing, Treasure Island (FL)
Kerlikowske K, Su Y-R, Sprague BL et al (2022) Association of screening with digital breast tomosynthesis vs digital mammography with risk of interval invasive and advanced breast cancer. JAMA 327:2220–2230. https://doi.org/10.1001/jama.2022.7672
Ahman H, Thompson L, Swarbrick A, Woodward J (2009) Understanding the advanced signal processing technique of real-time adaptive filters. J Diagn Med Sonogr 25:145–160. https://doi.org/10.1177/8756479309334354
Guo S, Huang X, Xu C et al (2023) Multiregional radiomic model for breast cancer diagnosis: value of ultrasound-based peritumoral and parenchymal radiomics. Quant Imag Med Surg 13:3127139–3123139. https://doi.org/10.21037/qims-22-939
Zwanenburg A, Leger S, Vallières M, Löck S (2020) Image biomarker standardisation initiative. Radiology 295:328–338. https://doi.org/10.1148/radiol.2020191145
Yip SSF, Aerts HJWL (2016) Applications and limitations of radiomics. Phys Med Biol 61:R150–R166. https://doi.org/10.1088/0031-9155/61/13/R150
Collewet G, Strzelecki M, Mariette F (2004) Influence of MRI acquisition protocols and image intensity normalization methods on texture classification. Magn Reson Imaging 22:81–91. https://doi.org/10.1016/j.mri.2003.09.001
van Timmeren JE, Cester D, Tanadini-Lang S, Alkadhi H, Baessler B (2020) Radiomics in medical imaging-“how-to” guide and critical reflection. Insights Imaging 11:91. https://doi.org/10.1186/s13244-020-00887-2
van Griethuysen JJM, Fedorov A, Parmar C et al (2017) Computational radiomics system to decode the radiographic phenotype. Cancer Res 77:e104–e107. https://doi.org/10.1158/0008-5472.CAN-17-0339
Kim JH (2019) Multicollinearity and misleading statistical results. Korean J Anesthesiol 72:558–569. https://doi.org/10.4097/kja.19087
Demircioğlu A (2021) Measuring the bias of incorrect application of feature selection when using cross-validation in radiomics. Insights Imaging 12:172. https://doi.org/10.1186/s13244-021-01115-1
Darst BF, Malecki KC, Engelman CD (2018) Using recursive feature elimination in random forest to account for correlated variables in high dimensional data. BMC Genet 19:65. https://doi.org/10.1186/s12863-018-0633-8
Liu Y, Chen P-HC, Krause J, Peng L (2019) How to read articles that use machine learning: users’ guides to the medical literature. JAMA 322:1806–1816. https://doi.org/10.1001/jama.2019.16489
Bossuyt PM, Reitsma JB, Bruns DE et al (2003) Reporting studies of diagnostic accuracy according to a standard method; the Standards for Reporting of Diagnostic Accuracy (STARD). Ned Tijdschr Geneeskd 147:336–340
Collins GS, Reitsma JB, Altman DG, Moons KGM (2015) Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. Ann Intern Med 162:55–63. https://doi.org/10.7326/M14-0697
Alpaydin E (2020) Introduction to machine learning. MIT Press. https://mitpress.mit.edu/9780262043793/introduction-to-machine-learning/. Accessed 5 Jan 2023
Vickers AJ, Woo S (2022) Decision curve analysis in the evaluation of radiology research. Eur Radiol 32:5787–5789. https://doi.org/10.1007/s00330-022-08685-8
Venkatraman ES (2000) A permutation test to compare receiver operating characteristic curves. Biometrics 56:1134–1138. https://doi.org/10.1111/j.0006-341x.2000.01134.x
Newcombe RG (1998) Two-sided confidence intervals for the single proportion: comparison of seven methods. Stat Med 17:857–872. https://doi.org/10.1002/(sici)1097-0258(19980430)17:8<857::aid-sim777>3.0.co;2-e
Harrell FE, Lee KL, Mark DB (1996) Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med 15:361–387. https://doi.org/10.1002/(SICI)1097-0258(19960229)15:4<361::AID-SIM168>3.0.CO;2-4
Spiegelhalter DJ (1986) Probabilistic prediction in patient management and clinical trials. Stat Med 5:421–433. https://doi.org/10.1002/sim.4780050506
van Ramshorst MS, van der Voort A, van Werkhoven ED et al (2018) Neoadjuvant chemotherapy with or without anthracyclines in the presence of dual HER2 blockade for HER2-positive breast cancer (TRAIN-2): a multicentre, open-label, randomised, phase 3 trial. Lancet Oncol 19:1630–1640. https://doi.org/10.1016/S1470-2045(18)30570-9
Kuerer HM, Smith BD, Krishnamurthy S et al (2022) Eliminating breast surgery for invasive breast cancer in exceptional responders to neoadjuvant systemic therapy: a multicentre, single-arm, phase 2 trial. Lancet Oncol 23:1517–1524. https://doi.org/10.1016/S1470-2045(22)00613-1
Heil J, Pfob A, Sinn H-P et al (2022) Diagnosing pathologic complete response in the breast after neoadjuvant systemic treatment of breast cancer patients by minimal invasive biopsy: oral presentation at the San Antonio breast cancer symposium on Friday, December 13, 2019, Program Number GS5-03. Ann Surg 275:576–581. https://doi.org/10.1097/SLA.0000000000004246
Pfob A, Sidey-Gibbons C, Rauch G et al (2022) Intelligent vacuum-assisted biopsy to identify breast cancer patients with pathologic complete response (ypT0 and ypN0) after neoadjuvant systemic treatment for omission of breast and axillary surgery. J Clin Oncol 40:1903–1915. https://doi.org/10.1200/JCO.21.02439
Riedel F, Heil J, Golatta M et al (2019) Changes of breast and axillary surgery patterns in patients with primary breast cancer during the past decade. Arch Gynecol Obstet 299:1043–1053. https://doi.org/10.1007/s00404-018-4982-3
Pesapane F, Rotili A, Agazzi GM et al (2021) Recent radiomics advancements in breast cancer: lessons and pitfalls for the next future. Curr Oncol 28:2351–2372. https://doi.org/10.3390/curroncol28040217
Yang M, Liu H, Dai Q et al (2022) Treatment response prediction using ultrasound-based pre-, post-early, and delta radiomics in neoadjuvant chemotherapy in breast cancer. Front Oncol 12:748008. https://doi.org/10.3389/fonc.2022.748008
Mao N, Shi Y, Lian C et al (2022) Intratumoral and peritumoral radiomics for preoperative prediction of neoadjuvant chemotherapy effect in breast cancer based on contrast-enhanced spectral mammography. Eur Radiol 32:3207–3219. https://doi.org/10.1007/s00330-021-08414-7
Li Q, Huang Y, Xiao Q et al (2022) Value of radiomics based on CE-MRI for predicting the efficacy of neoadjuvant chemotherapy in invasive breast cancer. Br J Radiol 95:20220186. https://doi.org/10.1259/bjr.20220186
Zhou J, Lu J, Gao C et al (2020) Predicting the response to neoadjuvant chemotherapy for breast cancer: wavelet transforming radiomics in MRI. BMC Cancer 20:100. https://doi.org/10.1186/s12885-020-6523-2
Liu Z, Li Z, Qu J et al (2019) Radiomics of multiparametric MRI for pretreatment prediction of pathologic complete response to neoadjuvant chemotherapy in breast cancer: a multicenter study. Clin Cancer Res 25:3538–3547. https://doi.org/10.1158/1078-0432.CCR-18-3190
Li C, Lu N, He Z et al (2022) A noninvasive tool based on magnetic resonance imaging radiomics for the preoperative prediction of pathological complete response to neoadjuvant chemotherapy in breast cancer. Ann Surg Oncol 29:7685–7693. https://doi.org/10.1245/s10434-022-12034-w
Kamiya S, Satake H, Hayashi Y et al (2022) Features from MRI texture analysis associated with survival outcomes in triple-negative breast cancer patients. Breast Cancer 29:164–173. https://doi.org/10.1007/s12282-021-01294-1
Ko ES, Lee BH, Kim HA, Noh WC, Kim MS, Lee SA (2010) Triple-negative breast cancer: correlation between imaging and pathological findings. Eur Radiol 20:1111–1117. https://doi.org/10.1007/s00330-009-1656-3
Moore JX, Han Y, Appleton C, Colditz G, Toriola AT (2020) Determinants of mammographic breast density by race among a large screening population. JNCI Cancer Spectr 4:pkaa010. https://doi.org/10.1093/jncics/pkaa010
Potnis KC, Ross JS, Aneja S, Gross CP, Richman IB (2022) Artificial intelligence in breast cancer screening: evaluation of FDA device regulation and future recommendations. JAMA Intern Med 182:1306–1312. https://doi.org/10.1001/jamainternmed.2022.4969
Pfob A, Sidey-Gibbons C (2022) Systematic bias in medical algorithms: to include or not include discriminatory demographic information? JCO Clin Cancer Inform 6:e2100146. https://doi.org/10.1200/CCI.21.00146
Mittendorf EA, Zhang H, Barrios CH et al (2020) Neoadjuvant atezolizumab in combination with sequential nab-paclitaxel and anthracycline-based chemotherapy versus placebo and chemotherapy in patients with early-stage triple-negative breast cancer (IMpassion031): a randomised, double-blind, phase 3 trial. Lancet 396:1090–1100. https://doi.org/10.1016/S0140-6736(20)31953-X
Duron L, Savatovsky J, Fournier L, Lecler A (2021) Can we use radiomics in ultrasound imaging? Impact of preprocessing on feature repeatability. Diagn Interv Imaging 102:659–667. https://doi.org/10.1016/j.diii.2021.10.004
Acknowledgements
We are grateful to Dr. Manuel Feisst for his advice on statistical analysis. We used the open-source “R” programming language (The R Foundation for Statistical Computing). The machine learning frameworks used in this study (TensorFlow and Keras) are available at https://github.com/tensorflow/tensorflow and https://github.com/keras-team/keras.
Data sharing statement
Will individual participant data be available (including data dictionaries)?
Yes
What data in particular will be shared?
Individual participant data that underlie the results reported in this article, after deidentification (text, tables, figures, and appendices).
What other documents will be available?
None
When will data be available (start and end dates)?
Immediately following publication. No end date.
With whom?
Researchers who provide a methodologically sound proposal.
For what types of analyses?
To achieve aims in the approved proposal.
By what mechanism will data be made available?
Proposals should be directed to andre.pfob@med.uni-heidelberg.de
To gain access, data requestors will need to sign a data access agreement.
Funding
Open Access funding enabled and organized by Projekt DEAL. The authors state that this work has not received any funding.
Ethics declarations
Guarantor
The scientific guarantor is André Pfob.
Conflict of Interest
The authors declare no competing interests.
Statistics and Biometry
No complex statistical methods were necessary for this paper.
Informed Consent
Written informed consent was not required for this study, because this is a retrospective study, and the objective is pursued through the retrospective analysis of internal data collected, used, and stored in the context of clinical routine for the purpose of treatment.
Ethical Approval
This study was approved by the Ethics Committee of Heidelberg University Medical Faculty (S-092/2022).
Study subjects or cohorts overlap
This study cohort has not been previously reported.
Methodology
- retrospective
- diagnostic study
- performed at one institution
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
ESM 1
(PDF 257 kb)
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Cai, L., Sidey-Gibbons, C., Nees, J. et al. Can multi-modal radiomics using pretreatment ultrasound and tomosynthesis predict response to neoadjuvant systemic treatment in breast cancer? Eur Radiol 34, 2560–2573 (2024). https://doi.org/10.1007/s00330-023-10238-6
DOI: https://doi.org/10.1007/s00330-023-10238-6