Introduction

Lung cancer ranks first in cancer mortality worldwide [1]. With the popularization of computed tomography (CT) and the application of low-dose CT for lung cancer screening, a substantial number of early-stage lung cancers have been detected [2]. Most malignant pulmonary nodules are confirmed as adenocarcinoma by pathology [3]. Patients with different types of adenocarcinoma differ in 5-year survival probability; e.g., patients diagnosed with invasive adenocarcinoma (IA) have a significantly poorer survival probability than those with adenocarcinoma in situ (AIS) or minimally invasive adenocarcinoma (MIA), whose survival probability is nearly 100% [4, 5]. Currently, lobectomy may be a better choice than sublobar resection for patients with IA, whereas patients with preinvasive lesions (atypical adenomatous hyperplasia (AAH) and AIS) and MIA (collectively PM) are candidates for limited resection [6].

Three methods are most commonly used for preoperative or intraoperative diagnosis in clinical practice, namely chest CT scan, biopsy, and intraoperative frozen section (FS). Many radiological studies rely on morphological (semantic) features such as spiculation or lobulation to generate a differential diagnosis. However, qualitative interpretation of the image is hampered by the strong subjectivity introduced by atypical radiological signs, especially in small nodules and ground-glass nodules (GGNs) [7,8,9,10]. Moreover, transbronchial and percutaneous biopsies are limited by the difficulties of sampling and localization [26]. In addition, a population-based prospective study indicated that the risk of developing lung cancer increases with age and with a family history of lung cancer for female patients [27]. However, in this study, only age and gender differ significantly between cohorts diagnosed with IA and PM, with males older than 60 years having a significantly higher probability of being diagnosed with IA. Age has been reported elsewhere to increase the risk of an IA diagnosis, while gender differences in the adenocarcinoma spectrum need further study [8,9,10]. Our results also show that a model based purely on clinical variables has low sensitivity and relatively high specificity for the identification of IA, which may lead to moderate diagnostic accuracy and low benefit on the decision curve. This result, however, should be interpreted with caution, because clinical variables vary across populations.
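To make the relationship between low sensitivity, high specificity, and moderate accuracy concrete, the following minimal sketch computes the three quantities from binary predictions. The function name and the toy labels are illustrative assumptions and are not data from this study; IA is coded as 1 and PM as 0.

```python
def diagnostic_metrics(y_true, y_pred):
    """Sensitivity, specificity and accuracy for binary labels (1 = IA, 0 = PM)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # IA correctly called IA
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # PM correctly called PM
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # PM wrongly called IA
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # IA wrongly called PM
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case of each class")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical toy cohort (not study data): 4 IA cases, 6 PM cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
metrics = diagnostic_metrics(y_true, y_pred)
print(metrics)
```

In this toy cohort most IA cases are missed (sensitivity 0.25) while most PM cases are correctly identified (specificity about 0.83), so the overall accuracy (0.6) is only moderate.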

Another study also examined semantic features, proposing that pulmonary nodules with a larger diameter, an upper-lobe location, spiculation, and a part-solid nodule (PSN) type had a higher probability of being malignant [27]. However, it has been shown that semi-automated volume analysis is a more robust method than a simple diameter measurement for assessing the size of a pulmonary nodule [28], and spiculation is an uncommon feature in early-stage lung cancer [8]. Our study finds that nodule diameter and nodule type are significantly different between cohorts diagnosed with IA and PM, with a smaller diameter and a pure GGN type increasing the probability of a PM diagnosis. These two semantic features by themselves, as well as the semantic model, show high AUC and accuracy values for the prediction and diagnosis of IA. Overall, our results indicate that the semantic feature model and the lesion volume model show predictive performance similar to that of radiomics, while radiomics has higher accuracy than the semantic and volume models.
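As an illustration of how a single semantic feature such as nodule diameter can be scored by AUC, the sketch below uses the rank-based (Mann-Whitney) definition: the fraction of IA/PM pairs in which the IA case has the larger value, with ties counted as one half. The function name and the diameters are hypothetical and are not taken from this study.

```python
def auc_from_scores(y_true, scores):
    """AUC as the probability that a positive case outscores a negative one."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one case of each class")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # positive case ranked above negative case
            elif p == q:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical diameters in mm (not study data); 1 = IA, 0 = PM.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0]
diameter_mm = [18, 22, 12, 25, 8, 10, 12, 6, 9]
auc = auc_from_scores(y_true, diameter_mm)
print(round(auc, 3))
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a sketch; a sort-based implementation would be preferable for large cohorts.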

It is also important to point out that the ground truth used for diagnosis in this study is fairly unusual: guidelines in most countries outside of Asia generally do not consider resection for pure GGNs (pGGNs), which are instead followed up until a solid component appears or the tumor progresses [29]. Moreover, pGGN adenocarcinomas are more common in low-risk Asian females than in other populations, and these patients more often request surgery. Around 34% of the nodules in this study are pGGNs, 30% of which are confirmed as IA, which may reflect doctors’ and patients’ more positive attitudes towards surgery.

In our study, the CT-based radiomics model shows predictive performance similar to that of FS in distinguishing IA from PM. Three selected features (Wavelet_HLL_Stats_max, Wavelet_LLL_Stats_cov, and LocInt_peakLocal) reflect the distribution of intensity values within the region of interest (ROI), and another selected feature (GLRLM_LGRE) describes the heterogeneity of the density within the ROI [23]. Lim et al. found that the mean density differs between IA and non-invasive or minimally invasive adenocarcinoma [8]. Moreover, a previous study reported that IA tends to appear more heterogeneous on CT images than PM [30]. Therefore, we hypothesize that radiomics features describing density and heterogeneity are related to tumor biology and pathology and are excellent predictors for the identification of IA [25].

CT and positron emission tomography radiomics studies have shown that predictive features can be a surrogate for lesion volume, so knowing which features correlate highly with volume is important [31,32,33]. Upon volume correlation analysis, we excluded one feature that correlated highly with volume and found no change in model performance. The volume was embedded into the radiomics signature since radiomics is synonymous with quantitative imaging; features that contribute to model performance should not be excluded a priori. In this study, a radiomics plus volume model (RV) showed a slight improvement in accuracy compared with the radiomics-alone model, and its AUC and accuracy values were similar to those of the CSRV model. In addition, we found that our models employing radiomics (i.e., radiomics alone, RV, and CSRV) had predictive performance (AUC) similar to that of the frozen section models. However, the accuracy of these models was lower than that of FS.
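A volume correlation analysis of this kind can be sketched as follows. This is only an assumed illustration: the use of Pearson correlation, the 0.9 cut-off, the function names, and the feature names and values are all hypothetical and are not the settings or data of this study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def split_by_volume_correlation(features, volume, threshold=0.9):
    """Separate features whose |r| with volume reaches the (assumed) threshold."""
    kept, flagged = {}, {}
    for name, values in features.items():
        r = pearson_r(values, volume)
        (flagged if abs(r) >= threshold else kept)[name] = r
    return kept, flagged


# Hypothetical lesion volumes (mm^3) and feature values, not study data.
volume = [100, 250, 400, 800, 1500]
features = {
    "feature_a": [1.0, 2.4, 4.1, 8.2, 14.9],  # tracks volume closely
    "feature_b": [0.3, 0.9, 0.2, 0.8, 0.4],   # essentially unrelated to volume
}
kept, flagged = split_by_volume_correlation(features, volume)
print("kept:", kept)
print("flagged as volume surrogates:", flagged)
```

Flagged features are candidates for exclusion or for replacement by volume itself; whether removing them changes model performance still has to be checked by refitting the model.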

Although FS can be a precise diagnostic method to guide intraoperative resection procedures for lung adenocarcinoma, it remains difficult to recommend a definitive assessment by FS alone [34]. Borczuk suggested that combining clinical and radiologic information with FS could reduce diagnostic errors [35]. Our results show no significant difference in AUC values between the FSRV and FSV models, but the former has better accuracy and calibration. Furthermore, we found that the AUC of the CSFSRV model is not significantly different from that of the FSRV model, and that the CSFSRV model neither increased accuracy nor achieved good calibration. In addition, the decision curve indicates that the models containing FS all performed better than the models without FS. Therefore, we conclude that the addition of radiomics (with volume) to FS analysis potentially creates a substantial biomarker for assessing the risk of invasive adenocarcinoma and could be applied in clinical practice.
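For readers unfamiliar with decision curves, the quantity plotted is the net benefit at a given risk threshold, conventionally defined as TP/n minus FP/n weighted by threshold/(1 - threshold). The sketch below computes it for a model and for the "treat all" reference strategy; the function names and the predicted risks are hypothetical and are not outputs of the models in this study.

```python
def net_benefit(y_true, risk, threshold):
    """Net benefit of treating cases whose predicted risk is >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for t, r in zip(y_true, risk) if r >= threshold and t == 1)
    fp = sum(1 for t, r in zip(y_true, risk) if r >= threshold and t == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical outcomes (1 = IA) and predicted risks, not study data.
y_true = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
risk = [0.9, 0.8, 0.4, 0.1, 0.2, 0.6, 0.3, 0.7, 0.05, 0.15]
nb_model = net_benefit(y_true, risk, threshold=0.5)
nb_all = net_benefit_treat_all(y_true, threshold=0.5)
print(nb_model, nb_all)
```

A full decision curve repeats this calculation over a range of thresholds; a model is preferred at thresholds where its net benefit exceeds both the "treat all" value and zero (the "treat none" strategy).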

Nevertheless, this study has certain limitations. First, because of the retrospective data collection, selection bias is unavoidable. Further prospective international investigation as a registered clinical trial is paramount. Second, different population cohorts, tumor morphology, and CT parameters are known to influence the results of radiomics features [36]. Further external validation datasets are needed to verify the reliability of our model, especially datasets that include diverse cohorts to fully capture phenotype heterogeneity. Third, the ROIs were contoured manually, which is time-consuming and highly prone to error. Therefore, a reliable and robust automatic segmentation tool is necessary to address this issue [37], one that also takes into account, e.g., peritumoral and normal tissue, to increase the accuracy of quantitative image-based models. Fourth, the accuracy and specificity of the FS analysis in our cohort were lower than the results from previous studies [6, 11]. We speculate that this is because we included more small nodules and GGN cases, for which FS has lower accuracy than for larger tumors, as most studies have found [6, 11, 12]. Future prospects include prospective validation, deep learning methods for automatic segmentation, and, in combination with the methods described in this study, novel parametric imaging techniques. While this work focuses on the correlation of radiomics features with the underlying biology (histology), future work will also focus on the direct prediction of clinical outcomes, such as overall survival, progression-free survival, or response to therapy.

In conclusion, a radiomics signature can be employed as a preoperative tool to distinguish invasive adenocarcinoma from preinvasive lesions or MIA. Furthermore, a multifactorial model combining radiomics with FS analysis is a potential biomarker for assessing the risk of invasive adenocarcinoma during surgery, and this model could help guide the therapeutic strategy for patients with pulmonary nodules.