Introduction

Cervical cancer (CC) is a common malignancy of the female reproductive tract [1]. In CC patients, disease progression is closely related to pelvic lymph node (LN) metastasis (LNM) status [2,3,4], as early-stage CC patients with and without pelvic LNM exhibit respective 5-year survival rates of 65% and 90% [5]. Postoperative radiotherapy is the most common recommendation for treating CC patients whose postoperative pathological findings reveal evidence of pelvic LNM [6], underscoring the need to accurately judge the pelvic LNM status of these patients so that staging can be performed accurately, prognostic estimates can be generated, and treatment strategies can be planned [7]. Currently, clinical efforts to diagnose pelvic LNM are primarily based on LN morphological characteristics derived from magnetic resonance imaging (MRI), with LN short-axis diameter measurements being the most frequently used in this context [8]. However, this strategy yields relatively low sensitivity rates of 30.3–72.9% when attempting to differentiate between metastatic and non-metastatic LNs [9, 10]. While the biopsy of sentinel LNs can yield a high degree of accuracy and sensitivity, it is an invasive strategy, and the resultant data can be impacted by the skill level of the clinician [11]. It thus remains highly challenging to effectively predict LNM status in CC patients prior to surgery.

Radiomics methods entail the extraction of high-dimensional quantitative data from clinical images, providing a means of characterizing microscopic features in tumors or other tissues not visible to the naked eye [12]. Radiomics strategies have increasingly been used to enhance diagnostic accuracy and prognostic predictive efforts for a range of tumor types [13,14,15]. Several reports have also demonstrated the utility of radiomics-based methods as a means of enhancing the accuracy of efforts to predict LNM [16,17,18]. However, there remains a lack of MRI radiomics-based studies specifically focused on predicting the pelvic LNM status of CC patients.

The present study was thus developed with the goal of establishing and validating an MRI radiomics-based predictive model capable of assessing the LNM status of CC patients.

Materials and methods

Study design and patients

The present retrospective analysis received approval from the hospital Institutional Review Board, which waived the requirement for informed consent. This study included a training cohort composed of 86 consecutive CC patients who were evaluated from June 2016 to June 2021, and a testing cohort composed of 38 consecutive CC patients who were evaluated from July 2021 to October 2022.

Patients were eligible for inclusion if they (a) had a hysteroscope-confirmed CC diagnosis prior to surgery, (b) underwent conventional MRI and diffusion-weighted imaging (DWI) within 7 days prior to surgery, and (c) had undergone pelvic LN dissection. Patients were excluded if they (a) had incomplete clinical data or (b) had undergone radiotherapy or chemotherapy prior to surgery. The same inclusion and exclusion criteria were applied to the training and test cohorts.

For all study participants, the baseline data and MRI-based radiomics data were collected. The baseline data included age, body mass index (BMI), tumor differentiation, depth of tumor invasion, clinical staging according to the 2018 International Federation of Gynecology and Obstetrics (FIGO) criteria, and serum levels of tumor biomarkers including squamous cell carcinoma antigen (SCC), carbohydrate antigen 199 (CA199), alpha-fetoprotein (AFP), and carcinoembryonic antigen (CEA). The MRI-based radiomics data were extracted from T2WI, fat suppression T2WI, and apparent diffusion coefficient (ADC) sequences.

MRI scanning

A 1.5T MRI instrument (Philips) with a body array coil (Ingenia) was used for all MRI analyses. The scanning sequence for each participant included axial T2WI, axial fat suppression T2WI, axial DWI (b-values: 0 and 800 s/mm²), sagittal T1WI, and sagittal T2WI. Fat suppression was achieved through a spectral attenuated inversion recovery (SPAIR) sequence. On the axial T2-weighted image, the radiologists delineated regions of interest at all levels of the lesion, and the software then automatically generated the volumes of interest (VOIs) and copied them to the ADC map. For detailed information regarding each scanning sequence, see Table 1.

Table 1 The parameters of the MRI

Tumor segmentation and feature extraction

Analyses of the sagittal T2WI, axial T2WI-SPAIR, and axial ADC images were conducted using the 3D Slicer software (v 5.03) to manually draw tumor boundaries and thereby define VOIs. Two radiologists with 5 and 10 years of experience delineated these VOIs while blinded to the pathological results for these patients. For the resultant MRI segmentation, see Fig. 1. The 3D Slicer program was additionally used for feature extraction, and intra- and inter-observer intraclass correlation coefficient (ICC) values were used to assess observer consistency. MRI images from 20 patients selected at random from the training cohort were independently segmented by the two radiologists. Reader 1 additionally segmented the tumors from the same 20 patients after a 1-week interval. Only those features exhibiting an ICC ≥ 0.9 were regarded as being highly repeatable and were retained for further analysis. Reader 1 was then responsible for segmenting all remaining images.
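The repeatability filter described above can be illustrated with a minimal sketch. It assumes the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1); the text does not state which ICC form was used, so this choice is an assumption. The function names `icc_2_1` and `repeatable_features` are hypothetical.

```python
from statistics import mean


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    ratings: list of rows, one row per patient, one value per reader/session.
    The ICC form is an assumption; the source does not specify it.
    """
    n = len(ratings)      # number of patients
    k = len(ratings[0])   # number of readers (or reading sessions)
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def repeatable_features(feature_ratings, threshold=0.9):
    """Keep the names of features whose ICC meets the threshold (0.9 in the text)."""
    return [name for name, r in feature_ratings.items() if icc_2_1(r) >= threshold]
```

In practice each feature would have one rating matrix for the inter-observer comparison (Reader 1 vs. Reader 2) and one for the intra-observer comparison (Reader 1 at two time points), and a feature would be retained only if both ICC values reach 0.9.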

Fig. 1

The figures of MRI segmentation on the sequences of a T2WI, b T2WI-SPAIR, and c ADC

Feature selection

Feature selection was conducted using a three-step process. Initially, the variance threshold method was used to identify all features with a variance > 0.8 for inclusion in the subsequent step. Then the SelectKBest method was implemented, and all features exhibiting a P-value < 0.05 were included in the next step. Lastly, LNM-associated features were selected with a least absolute shrinkage and selection operator (LASSO) regression model. The selected features were used to construct a radiomics signature such that radiomics scores were calculated for all CC patients.
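For reference, the final step can be written in the standard squared-error LASSO form shown below. This is a generic formulation and not necessarily the exact objective used here: the loss function and the way the penalty weight was tuned are not stated in the text, and the symbols (n patients, p candidate features x_{ij}, binary LNM label y_i, penalty weight \lambda) are introduced only for illustration. Features with non-zero coefficients form the signature, and the radiomics score is their weighted sum.

```latex
\hat{\beta}_0, \hat{\beta}
  = \arg\min_{\beta_0,\,\beta}
    \frac{1}{2n} \sum_{i=1}^{n}
      \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Bigr)^{2}
    + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert ,
\qquad
\text{Rad-score}_i = \hat{\beta}_0 + \sum_{j:\,\hat{\beta}_j \neq 0} \hat{\beta}_j \, x_{ij}
```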

MRI radiomics-based model establishment and validation

The outcome in this study was LNM status, and the MRI radiomics-based model was established to predict LNM(+) status. The LNM status was assessed according to the postoperative pathological results. Univariate and multivariate logistic regression analyses were then used to identify LNM-related risk factors in the training cohort to facilitate the combination of radiomics scores, clinical features, and serum biomarker data. A predictive nomogram was then constructed according to the LNM-related risk factors. The area under the receiver operating characteristic (ROC) curve (AUC) was used to assess the accuracy of the predictive model. The data from the test cohort were then entered into the MRI radiomics-based model to validate its accuracy.
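As an illustration of the AUC metric, the sketch below computes it from its rank-based definition: the probability that a randomly chosen LNM(+) patient receives a higher score than a randomly chosen LNM(−) patient, with ties counted as one half. The function name `roc_auc` and the example data are hypothetical, and the analyses reported here were run in SPSS and R rather than with this code.

```python
def roc_auc(labels, scores):
    """Rank-based AUC: fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for LNM(+), 0 for LNM(-); scores: model output per patient.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # positive ranked above negative
            elif p == q:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical example: three of the four positive/negative pairs are ordered correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```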

Benefits of clinical application of predictive model

To assess the clinical utility of the predictive model, decision curve analysis was utilized to evaluate the net benefit of the predictive model in both training and test cohorts.
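The net benefit plotted in a decision curve is commonly defined as the true-positive rate minus the false-positive rate weighted by the odds of the risk threshold. The sketch below follows that common definition; the text does not spell out the formula used, and the function names and example data are hypothetical.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold.

    labels: 1 for LNM(+), 0 for LNM(-); probs: predicted probabilities.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Reference strategy: treat every patient regardless of the model."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical example at a 50% risk threshold
labels = [1, 0, 1, 0]
probs = [0.9, 0.2, 0.6, 0.7]
print(net_benefit(labels, probs, 0.5))     # 0.25
print(net_benefit_treat_all(labels, 0.5))  # 0.0
```

A model is considered useful over the range of thresholds where its net benefit exceeds both the treat-all curve and the treat-none line (net benefit of zero).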

Statistical analyses

SPSS 25.0 and R 4.1.2 were used to analyze all data. Categorical data were compared with the χ2 test or Fisher’s exact test. Continuous data were compared with independent sample t-tests and Mann-Whitney U-tests when normally and non-normally distributed, respectively. LNM-related risk factors were identified with univariate and multivariate logistic regression analyses. AUC values of the ROC curves were compared with the DeLong test.

Results

Characteristics of the training cohort

The training cohort included 86 CC patients (Table 2), including 64 (74.4%) and 22 (25.6%) without and with LNM, respectively. No significant differences in age, BMI, cancer type, or serum cancer biomarker levels were observed when comparing LNM(−) and LNM(+) patients. However, significant differences between these groups were observed with respect to tumor differentiation, tumor invasion depth, and FIGO staging. Specifically, significantly higher proportions of LNM(+) patients exhibited poor tumor differentiation (40.1% vs. 6.2%, P < 0.001), cervical stromal invasion depth ≥ 1/2 (68.2% vs. 35.9%, P = 0.017), and FIGO stage 3 cancer (9.1% vs. 1.6%, P < 0.001) as compared to LNM(−) patients.

Table 2 Baseline data of the patients in the training group

Feature selection and radiomics score calculation

In total, 851 radiomics features were extracted per scan sequence (T2WI, T2WI-SPAIR, and ADC). A step-by-step process was then used to select features for these sequences and for a combination of these three scan sequences to facilitate the establishment of a radiomics signature. In total, 16 features were ultimately selected for use when calculating radiomics scores (Supplementary Table 1). Coefficient values for each feature and the mean square error for the combined sequences are presented in Supplementary Figure 1.

Predictive model establishment

Univariate analyses revealed that worse differentiation (P < 0.001), cervical stromal invasion depth ≥ 1/2 (P = 0.01), more advanced FIGO stage (P < 0.001), and higher combined sequence-based radiomics scores (P < 0.001) were all related to CC patient LNM status. In a multivariate analysis, LNM-related risk factors included worse differentiation (P < 0.001), more advanced FIGO stage (P = 0.03), and higher combined sequence-based radiomics scores (P = 0.01, Table 3).

Table 3 Risk factors of the LNM

These results were next used to construct a predictive model, the nomogram for which is presented in Fig. 2. The formula used to compute nomogram scores for this model was as follows: score = −0.0493 − 2.1410 × differentiation level (0: poor; 1: moderate; 2: well) + 7.7203 × combined sequence radiomics score + 1.6752 × FIGO stage (0: I; 1: II; 2: III). To maximize sensitivity and specificity, we selected a cut-off score of 0.662 (sensitivity = 81.8%, specificity = 85.9%). If the score was greater than or equal to 0.662, the patient was considered to be LNM(+). If the score was less than 0.662, the patient was considered to be LNM(−).
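The scoring rule above can be expressed as a short sketch. The coefficients, category codings, and cut-off are taken from the formula in the text; the function and constant names are hypothetical, and the radiomics score in the example is an invented input used only to show the arithmetic.

```python
# Category codings as given in the text
DIFFERENTIATION = {"poor": 0, "moderate": 1, "well": 2}
FIGO_STAGE = {"I": 0, "II": 1, "III": 2}
CUTOFF = 0.662  # cut-off score reported in the text


def nomogram_score(differentiation, rad_score, figo_stage):
    """Risk score from the published formula (names here are illustrative)."""
    return (
        -0.0493
        - 2.1410 * DIFFERENTIATION[differentiation]
        + 7.7203 * rad_score
        + 1.6752 * FIGO_STAGE[figo_stage]
    )


def predict_lnm_positive(differentiation, rad_score, figo_stage):
    """True when the score reaches the cut-off, i.e. the patient is classed LNM(+)."""
    return nomogram_score(differentiation, rad_score, figo_stage) >= CUTOFF


# Hypothetical patient: moderate differentiation, radiomics score 0.3, FIGO stage II
score = nomogram_score("moderate", 0.3, "II")
print(round(score, 5))                              # 1.80099
print(predict_lnm_positive("moderate", 0.3, "II"))  # True
```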

Fig. 2

The nomogram of predictive model

The AUC values for the T2WI, T2WI-SPAIR, ADC, and combined sequence radiomics scores, as well as the combined predictive model, were 0.656, 0.664, 0.658, 0.835, and 0.923, respectively (Fig. 3A, Table 4). The AUC for the radiomics score based on the combination of three sequences was significantly larger than the corresponding AUC values for radiomics scores computed based upon the T2WI (P = 0.005), T2WI-SPAIR (P = 0.008), and ADC (P = 0.01) models. The predictive model exhibited a significantly higher AUC value as compared to the combined sequence-based radiomics score (P = 0.04).

Fig. 3

The ROC curves of radiomics score of T2WI, radiomics score of T2WI-SPAIR, radiomics score of ADC, radiomics score of combined sequences, and the predictive model in the a training and b test groups

Table 4 Diagnostic performance of each parameter

Model validation

The testing group included 38 patients (Table 2), including 23 (72.2%) and 15 (27.8%) without and with LNM, respectively. No significant differences in baseline data were observed when comparing the training and testing cohorts (Table 2). The AUC values for the T2WI, T2WI-SPAIR, ADC, and combined sequence radiomics scores, as well as the combined predictive model, were 0.643, 0.525, 0.513, 0.826, and 0.82, respectively (Fig. 3B, Table 4). The AUC for the radiomics score based on the combination of three sequences was significantly larger than the corresponding AUC values for radiomics scores computed based upon the T2WI (P = 0.04), T2WI-SPAIR (P = 0.003), and ADC (P = 0.002) models. The AUC value for the predictive model was similar to that for the radiomics score based on the combined sequences (P = 0.94).

Potential clinical benefits of the predictive model

Calibration curves revealed a high degree of consistency between predicted and actual LNM status when using the predictive model in both the training and testing cohorts (Fig. 4A). Decision curves generated for this nomogram additionally revealed that this predictive model was associated with net benefits in both patient cohorts, with a risk threshold greater than 0 (Fig. 4B).

Fig. 4

The a calibration curves and b decision curve analysis of nomograms of predictive model

Discussion

In patients with CC, the ability to detect LNM prior to surgery is vital for effective treatment planning. Here, MRI-based radiomics score values, FIGO staging, and tumor differentiation were all identified as significant predictors of the LNM status of CC patients. By combining these factors into a predictive model, we can calculate a risk score for each patient. This risk score can be compared with the cut-off score of the predictive model to predict LNM status. Alternatively, the predicted probability of LNM can be read directly from the nomogram of the predictive model according to the patient’s risk score.

Conventional MRI-based approaches to assessing LNM status primarily center on LN sizing, with metastatic LNs being defined as nodes exhibiting a short-axis diameter > 10 mm in most cases [19]. However, this approach yields low sensitivity levels (30–73%) when attempting to differentiate between patients with and without LNM [9, 10]. This is consistent with the observations of Williams et al. [9], who determined that the short-axis diameter of 54.4% of metastatic LNs was less than 10 mm. While Koh et al. [20] suggested respective short-axis diameter thresholds of > 8 mm and > 10 mm when assessing the metastatic status of pelvic and retroperitoneal LNs as a means of improving diagnostic performance, LNs harboring micro-metastases can be normally sized such that they will not be detected through conventional MRI scans.

DWI offers a means of potentially detecting malignancies and metastatic LNs [19]. In their analyses of CC patients, Zhang et al. [19] determined that ADCmean and ADCmin were respectively associated with the highest levels of diagnostic accuracy when evaluating enlarged LNs and normally sized LNs, but the corresponding AUC values for these two parameters were just 0.644 and 0.758. This highlights the potential importance of extracting additional information from images in an effort to better assess tumor heterogeneity and to improve diagnostic performance.

For the present analysis, radiomics features were extracted from the T2WI, T2WI-SPAIR, and ADC sequences, all of which are frequently employed when assessing and analyzing CC and LN status. DWI can provide effective insight into tissue movement at the molecular level, yielding information regarding tumor cell infiltration and diffusion. T2WI sequences can highlight tumor morphology and anatomical structures, enabling the quantification of dimensional and morphological parameters that can enable the detection of relatively subtle tumor invasivity. Radiomics scores based on the combined sequences exhibited good predictive performance in both the training and testing cohorts, with respective AUCs of 0.835 and 0.826, both of which were higher than the AUCs associated with radiomics scores derived from individual T2WI, T2WI-SPAIR, and ADC sequences. This suggests that multiple sequences should be used when extracting radiomics features in order to improve the predictive performance of these features.

In addition to radiomics scores, poorer tumor differentiation and more advanced FIGO staging were both associated with LNM in CC patients. Both of these factors are indicative of higher-grade malignancies, consistent with greater LNM risk. In line with these results, Huang et al. [21] previously reported that worse differentiation was associated with an increase in the risk of LNM in patients with gastric cancer. Similarly, FIGO stage was also found to be strongly linked to the prognosis of the CC patients [22].

The predictive model developed herein exhibited respective AUC values of 0.923 and 0.82 in the training and validation cohorts. Both of these were higher than the respective AUC values of 0.754 and 0.727 reported in a prior study focused on designing a predictive MRI radiomics-based model for LNM diagnosis in CC patients published by Li et al. [23]. While MRI radiomics scores were implemented in both the present study and this past analysis, Li et al. only included red blood cell counts when assessing clinical risk factors [23], whereas tumor differentiation and FIGO staging were both risk factors that were incorporated into the present analyses. This suggests that differentiation and FIGO staging may offer more representative information regarding a target tumor as compared to red blood cell counts.

In the training cohort, the AUC for the predictive model was significantly higher than that for the combined sequence-based radiomics score alone (0.923 vs. 0.835, P = 0.04). This suggests that the incorporation of clinical data may further improve the diagnostic utility of radiomics scores. However, AUC values did not differ significantly in the testing cohort (0.82 vs. 0.826, P = 0.94). This is likely because there were no significant differences in tumor differentiation and FIGO staging between patients in this cohort with and without LNM.

There are some limitations to these analyses. First, the retrospective nature of these analyses renders them highly susceptible to potential bias. In addition, this was a single-center study. Additional prospective multicenter validation will thus be essential. Second, measurement errors cannot be entirely avoided when manually defining lesion boundaries, and factors such as edema, hemorrhage, necrosis, and degeneration can contribute to such errors. Third, radiomics strategies are often limited in their reproducibility and amenability to standardization, potentially restricting their utility [24].

Conclusions

In summary, the MRI radiomics-based model developed herein exhibited great promise as an accurate tool for predicting the LNM status of patients with CC. Relative to the use of radiomics scores based on any one MRI sequence, the combined predictive model established in this study was associated with significant improvements in overall diagnostic accuracy.