Introduction

Hepatocellular carcinoma (HCC) is the most common primary cancer of the liver and accounts for approximately 75–85% of all liver cancers [1]. Although there are many effective treatment methods that can help control tumour progression and improve prognosis, such as surgery, liver transplantation, radiofrequency ablation, and transarterial chemoembolization, the recurrence rate of HCC remains high [2]. Microvascular invasion (MVI) is defined as a microscopic tumour thrombus in the vascular lumen lined by endothelial cells, and it occurs mainly in the branches of the peritumoural portal vein [3]. MVI is a recognised predictive factor for the survival of patients with HCC and recurrence after surgery [4,5,6]. Thus, the ability to preoperatively predict MVI would greatly benefit HCC treatment.

Radiologists can directly describe some semantic features to evaluate MVI, such as tumour size, non-smooth tumour margin, internal arteries, a hypodense halo, and tumour–liver differences [7,8,9]. However, the evaluation of these features is highly subjective and variable, and it lacks robustness [15, 16].

A previous meta-analysis showed that MRI radiomics outperformed CT radiomics in terms of sensitivity, specificity, and the area under the receiver operating characteristic (ROC) curve (AUC) [17]. MRI also provides better soft-tissue resolution, multiparametric imaging, and more stable features for assessing tumour heterogeneity [18,19,20]. The liver-specific contrast agent gadolinium-ethoxybenzyl-diethylenetriamine pentaacetic acid (Gd-EOB-DTPA) and the hepatobiliary phase (HBP) may provide additional information [21,22,23,24,25]. Therefore, this study aimed to build a model for predicting MVI by extracting and selecting crucial variables from multiple sequences of Gd-EOB-DTPA-enhanced MRI.

Materials and methods

Patient criteria

We enrolled 165 patients who underwent partial hepatectomy for HCC between January 2015 and July 2020 in our hospital; these included 103 men and 62 women, aged 35–83 years, with a median age of 57 years. These patients met the following criteria (Fig. 1): (1) Gd-EOB-DTPA-enhanced MRI examination was performed within two weeks before surgery, (2) there was a postoperative pathological diagnosis of HCC, and (3) laboratory examination data were completed within two weeks before surgery. The exclusion criteria were: (1) local tumour treatment was received before surgery, including ablation therapy, radiotherapy, transarterial chemoembolization, and systemic therapy; (2) invasion or thrombosis of the portal vein, hepatic vein, inferior vena cava, and their main branches; (3) the presence of extrahepatic metastasis; and (4) the presence of artefacts on images. This retrospective study obtained ethical approval (approval number EK2021017), and the requirement for informed consent was waived.

Fig. 1 Flowchart of the enrollment of patients

Clinical data

We collected data on age, sex, hepatitis B surface antigen, alpha-fetoprotein (AFP), alanine aminotransferase, aspartate aminotransferase (AST), albumin (ALB), total bilirubin, direct bilirubin, platelets, prothrombin time (PT), and international normalised ratio for each patient. Two experienced pathologists evaluated cirrhosis of the liver parenchyma and MVI according to the Standardization for Diagnosis and Treatment of Primary Hepatic Carcinoma in China (2019 edition). We divided all patients into an MVI-positive or MVI-negative group.

MRI protocols

A Philips 3.0T Achieva MR scanner with a 16-channel abdominal coil was used to perform the abdominal MRI. The MRI plain scan was performed first, including T2-weighted imaging (T2WI) with spectral attenuated inversion recovery, diffusion-weighted imaging (DWI) (b = 0, 800 s/mm²), and in-phase/out-phase T1-weighted imaging (T1WI). The contrast agent Gd-EOB-DTPA (Germany Bayer, Medical Health Co., Ltd.) was injected into the cubital vein at a flow rate of 1.0 ml/s and a dose of 0.1 ml/kg of body weight, followed by a 20 ml saline flush. The enhancement scan used the T1 high-resolution isotropic volume excitation sequence (THRIVE). Axial images of the arterial phase (AP), portal venous phase (PP), transitional phase (TP), and HBP were collected at 20 s, 1 min, 3 min, and 20 min after injection of the contrast agent, respectively. The parameters of the scan sequences are listed in Table 1.

Table 1 The parameters of the scan sequences

Image analysis

Two experienced radiologists, with more than eight years of experience in abdominal disease diagnosis and blinded to the pathology results, evaluated the radiological features and reached a consensus. We recorded the multifocality of the lesions, tumour size, signal uniformity on T2WI, peritumoural enhancement in the AP, tumour capsule in the PP, tumour margin, and peritumoural hypointensity in the HBP. A smooth tumour margin presents as a nodular tumour with a smooth contour, while a non-smooth tumour margin is characterized by a lobulated tumour or irregular protrusions into the surrounding normal liver parenchyma. Peritumoural hypointensity is defined as an irregular, wedge-shaped, flame-like hypointense area of the liver parenchyma located outside of the tumour margin. The signal of the hypointense area is lower than that of the normal liver parenchyma but higher than that of the tumour itself. A few HCC lesions in the HBP show isointensity or hyperintensity; in these cases, the signal intensity of the peritumoural hypointense area is lower than that of the lesion, but the area is less hypointense than the hypointense rim that is usually depicted in those HCCs [26]. When there were multiple lesions, the largest lesion was selected. A physician who was not involved in the radiological and radiomics analyses checked the radiological and pathological findings and marked the largest HCC lesion. These subjective radiological features and clinical data are collectively referred to as the clinicoradiological data.

Drawing the regions of interest

The original images were imported into the Philips Radiomics Tool 1.9.2 software (Philips Investment Co., Ltd., Shanghai, China), which is based on Pyradiomics 3.0.0. Grey-level normalization was performed before feature extraction, and grey-level discretization was undertaken using a bin width of 2.5. A radiologist who was not involved in the radiological analysis (the first radiologist) manually delineated the region of interest (ROI) on the largest tumour section on the T2WI, DWI (b = 800 s/mm²), T1WI plain scan, AP, PP, TP, and HBP images (Fig. 2). When multiple lesions were present, only the largest lesion was delineated. A senior physician confirmed the ROI outlined by the first radiologist. Two months later, the first radiologist redrew the ROI on images from 20 randomly selected patients, and these data were used to calculate the intraclass correlation coefficients (ICCs).
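As an illustration of the discretization step, the following minimal sketch bins intensities with a fixed bin width of 2.5. It follows the fixed-bin-width convention described in the PyRadiomics documentation (bin edges at multiples of the bin width, lowest occupied bin numbered 1); whether the Philips tool applies exactly this rule is an assumption, and the function name and sample intensities are hypothetical.

```python
import math


def discretize_fixed_bin_width(intensities, bin_width=2.5):
    """Map continuous intensities to integer grey levels of fixed width.

    Assumed convention: bin edges sit at multiples of bin_width and the
    lowest occupied bin is numbered 1.
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    lowest_bin = math.floor(min(intensities) / bin_width)
    return [math.floor(x / bin_width) - lowest_bin + 1 for x in intensities]


# Hypothetical normalised intensities inside an ROI (not study data).
roi_intensities = [0.0, 1.2, 2.5, 4.9, 5.0, 7.6]
grey_levels = discretize_fixed_bin_width(roi_intensities)
print(grey_levels)  # [1, 1, 2, 2, 3, 4]
```

A fixed bin width keeps the meaning of one grey level constant across patients, at the cost of a variable number of grey levels per ROI.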

Fig. 2 The delineation of the ROI on T2WI (a), DWI (b), T1WI (c), AP (d), PP (e), TP (f), and HBP (g). The ROI was drawn on the largest section, within the tumour margin but as close to it as possible

Feature extraction

Feature extraction was performed using the Philips Radiomics Tool software (based on Pyradiomics 3.0.0). Each sequence yielded 1227 features: 17 shape features presented as statistical values; 19 first-order features that indicated the distribution of intensities; 75 texture features that quantified the intratumoural heterogeneity, derived from the grey-level size-zone matrix (GLSZM), grey-level run-length matrix (GLRLM), grey-level co-occurrence matrix (GLCM), neighbouring grey-tone difference matrix (NGTDM), and grey-level dependence matrix (GLDM); and 1116 first-order and texture features obtained after exponential, logarithmic, square, square-root, or wavelet filtering.

Feature selection

ICCs were used to evaluate the stability of each radiomics feature value. Features with an ICC ≥ 0.75 were used for further feature selection. The Z-score was used to standardise the features before putting them in the model. The overall cohort was divided at a ratio of 7:3 into a training set (n = 115) and a validation set (n = 50), and there were no significant differences in the baseline data between the training and validation sets. In the training set, stepwise regression was used to select the clinicoradiological variables. The least absolute shrinkage and selection operator (LASSO) algorithm was used to select important radiomics features by tenfold cross-validation. The LASSO algorithm was also used to determine variables from the combined dataset composed of the clinicoradiological data and radiomics features with ICC ≥ 0.75. The validation set was used to test the accuracy of the model predictions. A correlation coefficient heatmap was generated to show the correlation between clinicoradiological variables and radiomics features.
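The 7:3 split and the Z-score standardisation can be sketched as follows. This is a minimal illustration rather than the study's actual procedure: the text does not state whether the split was stratified or whether the standardisation parameters were estimated on the training set only, so the simple random split, the training-set-only fit, the seed, and the simulated feature values are all assumptions.

```python
import random
import statistics


def split_indices(n_patients, train_parts=7, total_parts=10, seed=0):
    """Shuffle patient indices and split them train:validation = 7:3."""
    indices = list(range(n_patients))
    random.Random(seed).shuffle(indices)  # arbitrary seed, for reproducibility
    n_train = n_patients * train_parts // total_parts
    return indices[:n_train], indices[n_train:]


def fit_zscore(values):
    """Return the mean and sample standard deviation used for standardisation."""
    return statistics.fmean(values), statistics.stdev(values)


def apply_zscore(values, mean, sd):
    """Standardise values with previously estimated parameters."""
    return [(v - mean) / sd for v in values]


train_idx, valid_idx = split_indices(165)
print(len(train_idx), len(valid_idx))  # 115 50

# Simulated values of one feature for 165 patients (not study data).
rng = random.Random(1)
feature = [rng.gauss(50.0, 10.0) for _ in range(165)]

# Assumption: parameters are estimated on the training set and reused
# unchanged on the validation set, so no validation data leak into the fit.
mean, sd = fit_zscore([feature[i] for i in train_idx])
train_z = apply_zscore([feature[i] for i in train_idx], mean, sd)
valid_z = apply_zscore([feature[i] for i in valid_idx], mean, sd)
```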

Model construction and evaluation

Based on the selected variables, multivariate logistic regression was used to construct the clinicoradiological model, radiomics model, and combined model. The AUC, sensitivity, and specificity were used to evaluate the diagnostic performance of the models. Univariate logistic regression analysis was performed between the selected variables and the MVI. A nomogram risk-prediction model was constructed based on the variables included in the combined model. The nomogram model intuitively shows scores based on the coefficients of the logistic regression model variables and converts them into probabilities of clinical events. The nomogram model was internally validated using 1000 random bootstrap resamplings. The discrimination and calibration of the nomogram model were evaluated using the C-index and calibration curve, respectively.
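The evaluation metrics named above can be illustrated with a short sketch. For a binary outcome such as MVI, the C-index equals the AUC, computed here as the proportion of positive–negative pairs in which the positive case receives the higher predicted probability (ties count as one half). The 0.5 cut-off, the function names, and the toy labels and scores are assumptions for illustration; the study's own analysis used the “pROC” and “rms” packages in R.

```python
def auc(y_true, scores):
    """Pairwise AUC, equal to the C-index for a binary outcome."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


def sensitivity_specificity(y_true, scores, threshold=0.5):
    """Sensitivity and specificity at a probability cut-off (assumed 0.5)."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data: 1 = MVI positive, 0 = MVI negative (not study data).
labels = [1, 1, 1, 0, 0, 0, 0]
predicted = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

toy_auc = auc(labels, predicted)
toy_sens, toy_spec = sensitivity_specificity(labels, predicted)
print(toy_auc, toy_sens, toy_spec)
```

In this toy example, 11 of the 12 positive–negative pairs are concordant, two of the three positives exceed the cut-off, and three of the four negatives fall below it.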

Statistical analysis

All statistical analyses were performed using the R software (version 3.6.0). The Kolmogorov–Smirnov test and analysis of variance were used to test for distribution normality and homogeneity of variance. The t-test or Mann–Whitney rank test was used to compare the differences in the quantitative data, and the chi-square test or Fisher’s exact test was used to compare the differences in the qualitative data. The LASSO algorithm and ROC curve analysis were performed using the “glmnet” and “pROC” packages, respectively, and the “rms” package was used to generate the nomogram and calibration curve. The correlation coefficient heatmap was drawn using the “corrplot” package. A P value < 0.05 indicated statistical significance.
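For reference, the LASSO-penalised logistic regression fitted by “glmnet” for a binary outcome can be written as the following optimisation problem. The notation is standard and is not taken from the original text: N is the number of patients in the training set, y_i ∈ {0, 1} is the MVI status of patient i, x_i is the vector of m standardised candidate variables, and λ is the penalty coefficient chosen by tenfold cross-validation.

$$\begin{aligned} (\hat{\beta}_0, \hat{\beta}) & = \mathop{\arg\min}_{\beta_0, \beta} \left\{ - \frac{1}{N} \sum_{i = 1}^{N} \left[ y_i \left( \beta_0 + x_i^{\top} \beta \right) - \log \left( 1 + e^{\beta_0 + x_i^{\top} \beta} \right) \right] + \lambda \sum_{j = 1}^{m} \left| \beta_j \right| \right\} \end{aligned}$$

The absolute-value penalty shrinks some coefficients exactly to zero as λ increases, which is why the method performs variable selection and not only shrinkage.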

Results

Patients

Among the 165 HCC patients, 49 were MVI positive and 116 were MVI negative. For the clinicoradiological model, radiomics model, and combined model, the baseline data did not differ significantly between the training and validation sets (P > 0.05). The baseline data of the clinicoradiological model are shown in Table 2.

Table 2 Comparison of the baseline data of the clinicoradiological data

Construction of the clinicoradiological model

The tumour capsule, tumour margin, peritumoural hypointensity, tumour size, AST, ALB, and PT were selected as significant variables related to MVI. Multivariate logistic regression was used to construct the clinicoradiological model.

The formula of the model was as follows:

$$\begin{aligned} \log (p/(1 - p)) & = 5.6642 - 0.6055 \times \text{tumour capsule} + 1.4956 \times \text{tumour margin} \\ & \quad + 1.8169 \times \text{peritumoural hypointensity} + 0.2511 \times \text{tumour size} \\ & \quad - 0.0233 \times \text{AST} - 0.0998 \times \text{ALB} + 2.0827 \times \text{PT} \end{aligned}$$

where p is the probability of MVI. The univariate logistic regression analysis of the variables in the validation set showed that peritumoural hypointensity had the highest odds ratio (OR) (Table 3).
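A minimal sketch of how such a formula is used: the linear predictor is converted into a probability with the inverse logit, and exponentiating a coefficient gives the adjusted odds ratio for a one-unit increase in that variable. The coefficients are copied from the formula above. The text does not state how each predictor is coded or scaled (for example, whether continuous variables were Z-score standardised), so the all-zero baseline input below is a placeholder on the model's own scale and not a realistic patient; the function names are hypothetical. These adjusted odds ratios are distinct from the univariate ORs reported in Table 3.

```python
import math

# Coefficients copied from the clinicoradiological model formula in the text.
INTERCEPT = 5.6642
COEFFICIENTS = {
    "tumour capsule": -0.6055,
    "tumour margin": 1.4956,
    "peritumoural hypointensity": 1.8169,
    "tumour size": 0.2511,
    "AST": -0.0233,
    "ALB": -0.0998,
    "PT": 2.0827,
}


def linear_predictor(values):
    """log(p / (1 - p)) for inputs given on the scale used to fit the model."""
    return INTERCEPT + sum(COEFFICIENTS[k] * values[k] for k in COEFFICIENTS)


def inverse_logit(z):
    """Convert log-odds to a probability in a numerically stable way."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def adjusted_odds_ratio(name, delta=1.0):
    """Multiplicative change in the odds for a delta-unit increase in one variable."""
    return math.exp(COEFFICIENTS[name] * delta)


# Placeholder inputs (assumption): every variable at zero on the model scale.
baseline = dict.fromkeys(COEFFICIENTS, 0.0)
non_smooth = dict(baseline, **{"tumour margin": 1.0})

z0 = linear_predictor(baseline)
z1 = linear_predictor(non_smooth)
odds_ratio_from_predictions = math.exp(z1 - z0)
print(odds_ratio_from_predictions, adjusted_odds_ratio("tumour margin"))
```

The two printed values agree, which shows that, with the other variables held fixed, changing the tumour margin from 0 to 1 multiplies the odds of MVI by exp(1.4956) regardless of the baseline.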

Table 3 The univariate logistic regression analysis of each variable with OR in the validation set of the clinicoradiological model

Construction of the radiomics model

The proportions of radiomics features with an intra-observer ICC ≥ 0.75, 0.5–0.75, and < 0.5 were 86.35%, 10.48%, and 3.17%, respectively. The LASSO algorithm was then used to further select from the features with ICC ≥ 0.75. Ultimately, eight radiomics features were selected to construct the radiomics model; four were from the DWI, and the other four were from T2WI, PP, TP, and T1WI.

The formula of the model was as follows:

$$\begin{aligned} \log (p/(1 - p)) & = - 8.6134 - 0.3037 \times \text{TP\_Wavelet.GLDM.LLH\_DependenceVariance (DV)} \\ & \quad + 6.7528 \times \text{PP\_GLCM\_InverseDifferenceNormalized (IDN)} \\ & \quad + 0.0290 \times \text{T1WI\_Exponential.GLDM\_DV} \\ & \quad + 0.0008 \times \text{DWI\_Wavelet.FirstOrder.LLL\_Range} \\ & \quad - 0.0009 \times \text{DWI\_Wavelet.GLSZM.HLL\_SizeZoneNon-uniformity (SZN)} \\ & \quad + 0.1380 \times \text{DWI\_SquareRoot.GLSZM\_SZN} \\ & \quad + 0.3040 \times \text{DWI\_Logarithm.GLSZM\_SmallAreaHighGrayLevelEmphasis (SAHGLE)} \\ & \quad + 0.0040 \times \text{T2WI\_Wavelet.FirstOrder.HLL\_Maximum} \end{aligned}$$

The univariate logistic regression of the variables in the validation set showed that SquareRoot.GLSZM_SZN from the DWI had the highest OR, followed by Wavelet.FirstOrder.HLL_Maximum from the T2WI and Wavelet.GLSZM.HLL_SZN from the DWI (Table 4).

Table 4 The univariate logistic regression analysis of each variable with OR in the validation set of the radiomics model

The correlation between clinicoradiological variables and the above 8 radiomics features is shown in the correlation coefficient heatmap (Fig. 3).

Fig. 3 The correlation coefficient heatmap for clinicoradiological variables and radiomics features selected in the radiomics model. The larger the value or the darker the color, the stronger the correlation

Construction of the combined model

The combined data comprised 20 clinicoradiological variables and the radiomics features with ICC ≥ 0.75; the tumour margin, peritumoural hypointensity, and seven radiomics features were selected by the LASSO algorithm to construct the combined model (Fig. 4). The tumour margin and peritumoural hypointensity were significant variables in both the clinicoradiological model and the combined model. The size zone non-uniformity (SZN) and small area high grey-level emphasis (SAHGLE) from the DWI and the Maximum from the T2WI were significant variables in both the radiomics model and the combined model.

Fig. 4 LASSO algorithm used for the combined model. a The variation of the coefficients of the variables with the penalty coefficient (λ). b Tenfold cross-validation was used to select λ. When the binomial deviance was the smallest (minimum criterion), nine nonzero coefficients were determined

The specific formula of the model was:

$$\begin{aligned} \log (p/(1 - p)) & = - 8.8963 + 0.7930 \times \text{tumour margin} + 2.1700 \times \text{peritumoural hypointensity} \\ & \quad - 11.3837 \times \text{HBP\_Logarithm.GLSZM\_SmallAreaLowGrayLevelEmphasis (SALGLE)} \\ & \quad - 9.3923 \times \text{PP\_Wavelet.GLCM.HLL\_IDN} \\ & \quad + 17.1969 \times \text{DWI\_Wavelet.FirstOrder.HLL\_Skewness} \\ & \quad + 0.0400 \times \text{HBP\_shapebased\_MeshVolume} \\ & \quad + 0.0804 \times \text{DWI\_SquareRoot.GLSZM\_SZN} \\ & \quad + 0.2524 \times \text{DWI\_Logarithm.GLSZM\_SAHGLE} \\ & \quad + 0.0017 \times \text{T2WI\_Wavelet.FirstOrder.HLL\_Maximum} \end{aligned}$$

The results of the univariate logistic regression in the validation set are presented in a forest plot in Fig. 5. Peritumoural hypointensity had the highest OR, followed by the tumour margin.

Fig. 5 Forest plot showing the univariate logistic regression analysis of each variable with OR in the validation set of the combined model

Construction of the nomogram model

A nomogram risk-prediction model was established based on the variables in the combined model (Fig. 6). The C-index of the nomogram model was 0.79 (95% CI: 0.68–0.83). The calibration curve showed that the predicted probability from the nomogram model was close to the actual probability (Fig. 7). The mean absolute error between the predicted probability and the actual occurrence of MVI was 0.03, which indicated that the nomogram model had good calibration.

Fig. 6 Construction of the nomogram. The points for each variable are added up to obtain the total points, which correspond to the predicted risk probability of MVI. The nomogram can be used to predict the risk probability of MVI for each patient

Fig. 7 Calibration curve of the nomogram model

Performance of the models

The performances of the three models for predicting MVI are shown in Table 5. The AUC of the combined model was higher than that of the clinicoradiological and radiomics models in the training set (0.841, 0.782, and 0.715, respectively). Similar results were observed in the validation set, with AUC values of 0.826, 0.755, and 0.708, respectively. The AUC of the radiomics model was higher than that of the clinicoradiological model in both the training and validation sets. The combined model had the highest sensitivity for predicting MVI (81.43% in the training set and 90.89% in the validation set); the clinicoradiological model had the highest specificity (91.67% in the training set and 92.07% in the validation set).

Table 5 The performances of the models in predicting MVI

Discussion

In previous studies, the radiomics features from T2WI, T1WI, DWI, apparent diffusion coefficient (ADC) map, AP, and PP had specific values for predicting MVI [27,28,29,30]. However, there are few radiomics studies based on Gd-EOB-DTPA-enhanced MRI [21,22,23,24,25]. Although some studies selected features from the HBP and achieved excellent performance [21,22,23], models composed of features from multiple sequences of Gd-EOB-DTPA-enhanced MRI have rarely been discussed. This study selected radiomics features from multiple sequences and used logistic regression to establish models for predicting MVI in HCC. Unlike previous studies [21,22,23], the combined model of this study used the LASSO algorithm to select features from the combined dataset to avoid collinearity between the clinicoradiological variables and radiomics features. We also used a correlation coefficient heatmap to show the possible correlation between the clinicoradiological variables and radiomics features.

The results showed that, among the variables included in the clinicoradiological model, only the tumour margin and peritumoural hypointensity in the HBP were stable in the combined model. Among the features in the radiomics model, the SZN and SAHGLE from the DWI and the Maximum from the T2WI remained important radiomics features in the combined model.

In both the clinicoradiological model and the combined model, the ORs for the tumour margin and peritumoural hypointensity were significantly higher than those of the other variables in the validation set. Several studies have demonstrated that peritumoural hypointensity in the HBP is an independent risk factor for predicting MVI [26, 31]. Peritumoural hypointensity may be related to peritumoural hemodynamic changes caused by the tumour thrombus obstructing the microbranches of the portal vein [31]. Chou et al. [32] studied 60 HCC specimens with histopathological evidence of MVI, and 40 showed evidence of focal extra-nodular extension. In 36 of 40 HCC specimens with focal extra-nodular extension, non-smooth margins on CT images were located in the same octant as MVI in the histopathological specimens. This finding is consistent with our research and other radiological studies, which reported that the non-smooth tumour margin is an MVI-related risk factor [33, 34].

Some studies have shown that tumour size and an incomplete tumour capsule are also independent risk factors for predicting MVI [35, 36]. These two variables were essential in the clinicoradiological model in this study, but they were not included in the combined model. The high correlation between radiomics features and the two radiological variables may explain this finding. The correlation coefficient heatmap showed that the tumour size and tumour margin are highly correlated with some radiomics features.

The OR of the SZN from the DWI was higher than that of other radiomics features in both the radiomics model and the combined model in the validation set. SZN measures the variability of size zone volumes in the image. Jiang et al. [37] found that the SZN from the PP of enhanced CT is an essential feature for predicting MVI in HCC. Ma et al. [38] reported that the Maximum from the AP and delayed phase of CT was another important MVI-predicting factor. Maximum is the maximum grey level intensity within the ROI. Although the above radiomics features were obtained from contrast-enhanced CT, the SZN and Maximum may be relatively stable for predicting MVI. Most of the radiomics features in the radiomics model and the combined model are matrix-based texture features, and texture features can reflect the heterogeneity of the tumour; therefore, MVI may be related to the heterogeneity of the tumour [31, 39].

Multifocality was not a significant factor affecting MVI in this study. Thus, the cases with multiple lesions were not excluded in this study. We only selected the largest lesion to draw the ROI, which may confound the possible effects of multicentric carcinogenesis on outcomes. The prediction model constructed by Yang et al. [21], which combined AFP, non-smooth tumour margin, peritumoural enhancement, HBP T1WI signature, and HBP T1 map signature based on Gd-EOB-DTPA-enhanced MRI, reached an AUC of 0.861 in the validation set. The AUC of the combined model of this study in the validation set was 0.826, which was lower than that of the above study [21]. This might be explained by the fact that our study only extracted features from the largest section of the tumour because of the huge workload. Therefore, some additional information might have been missed. Nevertheless, combining different clinical or radiological variables has a significant effect on the outcomes. In addition, the combined model in our study included variables selected from the combined data, and was thus different from other combined models constructed by merging the variables included in a clinical model or radiomics model. Gitto et al. [40] found that 3D and 2D MRI-based radiomics analyses of cartilaginous bone tumours provide similar rates of stable features, and the 2D approach is easier to implement in clinical practice. Thus, single-section segmentation of HCC may provide relatively equivalent value for predicting MVI, which may help reduce the workload. Although we used one section to extract radiomics features, the nomogram risk-prediction model based on the combined model still showed good discrimination and value.

This study has some limitations. First, this was a retrospective, single-centre study with a small sample size and no external validation, so the robustness of the results is limited. Second, this study delineated the ROI at the maximum cross-section of the tumour rather than the whole tumour volume, and manual delineation could be affected by the subjectivity of radiologists. Third, we only assessed the intra-observer agreement, and the failure to evaluate inter-observer agreement might affect the stability of the results, given that different observers are subjective in delineating the ROI. Fourth, this study selected essential features from all the sequences but did not compare the value of a single sequence or the combination of different sequences, as the features selected from a single sequence might not always have value in a model based on multiple sequences and the features in a multi-sequence model may not be stable in a single-sequence model [29].

Conclusions

In summary, two semantic features that are highly correlated with MVI, peritumoural hypointensity and tumour margin, combined with multi-sequence radiomics features, have potential value in predicting MVI preoperatively. The nomogram based on the combined variables can intuitively provide the probability of MVI and guide clinical decisions.