Introduction

Iodinated contrast media are widely used in computed tomography (CT) to enhance tissue contrast, making it easier to evaluate anatomic structures and pathologies. However, iodinated contrast media have potential adverse effects varying from minor physiologic reactions to severe life-threatening situations, although their incidence has decreased with the development of low-osmolar and non-ionic contrast agents1,2. Many chest CT examinations, which are undeniably crucial diagnostic tools to evaluate thoracic disorders, are non-contrast CT (NCCT), especially for screening purposes or initial evaluation. The use of contrast in chest CT is often unnecessary for detecting lung parenchymal lesions. However, contrast-enhanced CT (CECT) plays a critical role in the detailed assessment of the mediastinum, pleura, and vessels.

In recent years, deep learning has been applied to various tasks in medical imaging, including automatic lesion detection, segmentation, and image quality improvement. One of the most interesting current applications of deep learning in medical imaging is synthetic image generation, and the generative adversarial network (GAN) is considered state-of-the-art for such a task3,

Image analysis: technical evaluation

We employed the mean absolute error (MAE), peak signal-to-noise ratio (PSNR)14,15, multiscale structural similarity index measurement (MS-SSIM)16, and learned perceptual image patch similarity metric (LPIPS)17 for quantitative evaluation on the tuning set and test set 1. A lower MAE, higher PSNR, higher MS-SSIM, and lower LPIPS indicate higher similarity to the ground truth. MAE and PSNR reflect the absolute numerical difference between two images, whereas MS-SSIM correlates with similarity in the structural composition of pixels14,18. LPIPS is a more recently proposed metric of perceptual distance based on widely used pretrained deep neural networks17,19. For comparison, we calculated the metrics for both sCECT and input images (NCCT or VNC) in the mediastinal window (window width, 350 HU; level, 50 HU), each relative to the corresponding CECT images. We included only axial slices between the top of the aortic arch and the diaphragm in the image similarity analysis.
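As a minimal sketch of the two pixel-wise metrics, the Python snippet below applies the mediastinal window (width 350 HU, level 50 HU) and then computes MAE and PSNR between an image and its reference. The function names, the rescaling of windowed values to [0, 1], and the PSNR data range of 1.0 are assumptions made for illustration; the intensity scale used in the study is not stated here, so absolute values from this sketch are not comparable to the reported ones. MS-SSIM and LPIPS are omitted because they rely on dedicated implementations.

```python
import numpy as np


def apply_window(hu, width=350.0, level=50.0):
    """Clip HU values to the display window and rescale them to [0, 1].

    The [0, 1] rescaling is an assumption of this sketch.
    """
    lo = level - width / 2.0
    hi = level + width / 2.0
    hu = np.asarray(hu, dtype=np.float64)
    return (np.clip(hu, lo, hi) - lo) / (hi - lo)


def mae(image, reference):
    """Mean absolute error between two images of identical shape."""
    return float(np.mean(np.abs(image - reference)))


def psnr(image, reference, data_range=1.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    mse = float(np.mean((image - reference) ** 2))
    if mse == 0.0:
        return float("inf")
    return float(10.0 * np.log10(data_range ** 2 / mse))


# Hypothetical 2x2 HU patches, not data from the study.
cect_patch = np.array([[40.0, 180.0], [-100.0, 60.0]])
ncct_patch = np.array([[40.0, 45.0], [-100.0, 55.0]])

reference = apply_window(cect_patch)
candidate = apply_window(ncct_patch)
print(f"MAE = {mae(candidate, reference):.4f}, PSNR = {psnr(candidate, reference):.2f} dB")
```

In this sketch, each slice would be windowed first and compared with its CECT counterpart afterwards, so that differences outside the mediastinal window do not contribute to either metric.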

Image analysis: performance in depicting mediastinal lymph nodes

To explore the clinical utility of sCECT images, we evaluated the performance of sCECT in depicting mediastinal lymph nodes using test set 2. As a quantitative analysis, we measured the lesion contrast-to-noise ratio (CNR) of the mediastinal lymph nodes. For each lesion, the measurement was performed on the axial slice of the contrast-enhanced CT on which the short-axis diameter was measured. We first drew a circular region of interest (ROI) inside the lesion, with a diameter of 90% of the lesion’s short-axis diameter. Circular ROIs of the same size were additionally drawn inside the descending thoracic aorta and subcutaneous fat of the bilateral chest wall. The ROIs were then copied to the same locations on the non-contrast and synthetic contrast-enhanced axial images. The CNR of each lesion was calculated as follows:

$$ \text{Background noise} = \sqrt{\frac{\text{SD}_{\text{right fat}}^{2} + \text{SD}_{\text{left fat}}^{2}}{2}} $$
$$ \text{CNR}_{\text{lesion}} = \frac{\left| \text{HU}_{\text{DTA}} - \text{HU}_{\text{lesion}} \right|}{\text{Background noise}} $$

where HU is the mean attenuation (in Hounsfield units) of the ROI, SD is the standard deviation of attenuation within the ROI, and DTA is the descending thoracic aorta.
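The two formulas can be sketched in a few lines of Python. The function and argument names are hypothetical, and the example values are illustrative rather than measurements from the study.

```python
import math


def background_noise(sd_right_fat, sd_left_fat):
    """Root mean square of the SDs measured in the right and left chest-wall fat ROIs."""
    return math.sqrt((sd_right_fat ** 2 + sd_left_fat ** 2) / 2.0)


def lesion_cnr(hu_dta, hu_lesion, sd_right_fat, sd_left_fat):
    """Absolute difference of mean HU (aorta vs lesion) divided by background noise."""
    noise = background_noise(sd_right_fat, sd_left_fat)
    if noise == 0.0:
        raise ValueError("background noise is zero; CNR is undefined")
    return abs(hu_dta - hu_lesion) / noise


# Hypothetical ROI measurements (HU), for illustration only.
cnr = lesion_cnr(hu_dta=180.0, hu_lesion=90.0, sd_right_fat=18.0, sd_left_fat=20.0)
print(f"lesion CNR = {cnr:.2f}")
```

Because the numerator is the aorta-to-lesion difference, a higher value indicates that the lymph node is better separated from enhanced vessels relative to image noise.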

For the qualitative analysis, two blinded board-certified radiologists (Y.J.C. and S.B.L. with 8 and 3 years of experience, respectively) participated in a three-session review of CT images with two-week intervals using a Digital Imaging and Communications in Medicine viewer (RadiAnt, version 2020.1; Medixant, Poznan, Poland). The three sessions consisted of NCCT alone, NCCT with sCECT, and CECT images, respectively; in each session, the images of the patients in test set 2 were presented in random order. The reviewers were instructed to label mediastinal lymph nodes with a short-axis diameter > 5 mm and report lesion conspicuity on a 4-point scale (1, barely perceptible with presence debatable; 2, subtle finding but likely a lesion; 3, definite lesion detected; and 4, strikingly evident and easily detected)20. The conspicuity of undetected lesions was recorded as 0.

Statistical analysis

For comparison of image similarity metrics and lesion CNR, we applied the paired t-test or the Wilcoxon signed-rank test according to the result of the Shapiro–Wilk normality test. For the observer study, the detection rate of the lymph nodes was compared using the McNemar test, and the differences in lesion conspicuity were evaluated using the Wilcoxon signed-rank test. We also evaluated lesion localization using the figures of merit (FOMs) from jackknife alternative free-response receiver operating characteristic (JAFROC) analysis21. We report the results from the random-reader, fixed-case JAFROC analysis because of the small number of cases in our study. P < 0.05 was considered indicative of a statistically significant difference. All data were analyzed using MedCalc (version 12.7, MedCalc Software, Ostend, Belgium), the scikit-learn library (version 0.20.3, https://scikit-learn.org/), and JAFROC software for Windows (version 4.2.1, WindowsJafroc, https://www.devchakraborty.com).
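To illustrate how a paired comparison of detection rates works, the sketch below implements a two-sided exact McNemar test from the two discordant counts (nodes detected under only one of the two reading conditions). This is an assumption-laden illustration: the analysis in the study was run in MedCalc, whose McNemar variant may differ, and the discordant counts used here are hypothetical because they are not reported in the text.

```python
from math import comb


def mcnemar_exact(only_a, only_b):
    """Two-sided exact McNemar P value.

    only_a: pairs positive under condition A only (discordant).
    only_b: pairs positive under condition B only (discordant).
    Concordant pairs do not enter the test.
    """
    n = only_a + only_b
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(only_a, only_b)
    # Binomial(n, 0.5) probability of observing k or fewer in the smaller cell.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Hypothetical discordant counts, for illustration only.
p_value = mcnemar_exact(only_a=3, only_b=14)
print(f"exact McNemar P = {p_value:.4f}")
```

The test depends only on the discordant pairs, which is why two detection rates alone (for example, 76% vs 49%) are not enough to reproduce a McNemar P value.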

Informed consent

This retrospective study was approved by the institutional review boards, which waived the need for patient informed consent.

Results

Patient characteristics

Patient characteristics and CT acquisition parameters are summarized in Table 1. A total of 62 patients (35 men, 27 women) with a median age of 67.5 years (interquartile range [IQR], 58–74 years) were enrolled. The development set comprised 25 patients (median age, 66 years [IQR, 53–79]; 13 men, 12 women). For test set 1, among 42 patients who underwent thoracic CT angiography on one of four different CT scanners (Somatom Force and Somatom Definition, Siemens, Erlangen, Germany; IQon and iCT 256, Philips, Andover, Massachusetts) at Hospital #2, 17 patients were excluded because of motion artifacts (n = 9) or suboptimal contrast opacification (n = 8). Test set 1 thus included 25 patients (median age, 66 years [IQR, 58.5–72]; 14 men, 11 women). Among them, the 18 patients whose CT vendor was the same as that in the development set were included in test set 1A, and test set 1B consisted of the seven remaining patients. For test set 2, among 35 patients who underwent pre-bronchoscopic CT at Hospital #2, 23 patients without significant mediastinal lymphadenopathy were excluded. Thus, test set 2 comprised 12 patients (median age, 70.5 years [IQR, 67–76]; 8 men, 4 women) with a total of 55 mediastinal lymph nodes (mean short-axis diameter, 8.62 ± 2.47 mm).

Table 1 Patient characteristics and CT acquisition parameters.

Technical evaluation

Examples of representative cases from the tuning set and test set 1 are shown in Figs. 2 and 3, respectively. The sCECT images showed significantly higher similarity to the ground-truth CECT than NCCT in all quantitative metrics in both the tuning set and test set 1 (Fig. 4, Table 2). In the tuning set, the sCECT images showed a lower median MAE (33.19 [IQR, 29.24–34.53] vs 34.64 [IQR, 30.73–44.67]; P < 0.001), a higher median PSNR (25.84 [IQR, 25.22–26.70] vs 18.72 [IQR, 18.16–19.85]; P < 0.001), higher median MS-SSIM (0.97 [IQR, 0.96–0.98] vs 0.91 [IQR, 0.88–0.92]; P < 0.001), and lower median LPIPS (0.04 [IQR, 0.04–0.05] vs 0.09 [IQR, 0.07–0.10]; P < 0.001) than NCCT images. In test set 1, sCECT had a lower median MAE (41.72 [IQR, 37.36–46.90] vs 48.74 [IQR, 39.73–54.48]; P < 0.001), higher median PSNR (17.44 [IQR, 16.37–18.60] vs 15.97 [IQR, 14.79–17.19]; P < 0.001), higher median MS-SSIM (0.84 [IQR, 0.79–0.86] vs 0.81 [IQR, 0.76–0.84]; P < 0.001), and lower median LPIPS (0.14 [IQR, 0.12–0.16] vs 0.15 [IQR, 0.13–0.18]; P < 0.001) than NCCT. The findings were also similar in subsets of test set 1.

Figure 2

Images of a 63-year-old woman with right pneumothorax from the tuning set are presented. Non-contrast CT (A), synthetic contrast-enhanced CT (B), and contrast-enhanced CT (C).

Figure 3

Images of a 65-year-old woman with right pleural effusion from test set 1 are presented. Non-contrast CT (A), synthetic contrast-enhanced CT (B), and contrast-enhanced CT (C).

Figure 4

Comparison of image similarity metrics between non-contrast CT (NCCT) and synthetic contrast-enhanced CT (sCECT) with contrast-enhanced CT as the ground truth. Mean absolute error (MAE) (A), peak signal-to-noise ratio (PSNR) (B), multiscale structural similarity index measurement (MS-SSIM) (C), and learned perceptual image patch similarity metric (LPIPS) (D). Lower MAE, higher PSNR, higher MS-SSIM, and lower LPIPS values indicate higher image similarity. All comparisons showed significant differences (P < 0.05).

Table 2 Evaluation of quantitative similarity metrics.

Performance in depicting mediastinal lymph nodes

An example of a representative case from test set 2 is shown in Fig. 5. On CECT images in test set 2, the median lesion CNR of the mediastinal lymph nodes and the median background noise across node measurements (n = 55) were 4.60 (IQR, 3.79–5.79) and 18.95 (IQR, 16.66–20.92), respectively. The median lesion CNR in the sCECT group was higher than that in the NCCT group (5.00 [IQR, 1.97–10.25] vs 0.52 [IQR, 0.14–1.03]; P < 0.001), while the median background noise in the sCECT group was also higher than that in the NCCT group (18.88 [IQR, 17.38–21.56] vs 17.79 [IQR, 16.04–17.79]; P < 0.001). We did not statistically compare measurements between sCECT and CECT images because the degree of contrast enhancement differed between the development set and test set 2 owing to the CT protocols.

Figure 5

Images of a 78-year-old man with lung cancer and multiple mediastinal lymph node metastases from test set 2. Non-contrast CT (A, D), synthetic contrast-enhanced CT (B, E), and contrast-enhanced CT (C, F). Hilar lymph nodes (arrows), which are clearly visible on contrast-enhanced CT, are better distinguished from adjacent pulmonary vessels (arrowheads) on synthetic contrast-enhanced CT than on non-contrast CT.

In the observer study on test set 2, both reviewers detected a higher number of lymph nodes on NCCT with sCECT than on NCCT alone (reviewer 1, 76% [42 of 55 nodes] vs 49% [27 of 55 nodes], P = 0.003; reviewer 2, 38% [21 of 55 nodes] vs 29% [16 of 55 nodes], P = 0.06). The reader-averaged JAFROC FOMs calculated from NCCT alone, NCCT with sCECT, and CECT were 0.48, 0.52, and 0.68, respectively. There was no significant difference in JAFROC FOMs between the modalities (P = 0.059). The FROC curves from the three modalities are shown in Supplementary Fig. S2. Both reviewers gave higher lesion conspicuity ratings for NCCT with sCECT than for NCCT alone (P ≤ 0.001 for both), and both also rated CECT images higher than images of the other two groups (P < 0.001 for both; Fig. 6, Supplementary Table S1).

Figure 6

Comparison of radiologists’ ratings regarding lesion conspicuity of mediastinal lymph nodes in test set 2 between non-contrast CT (NCCT), non-contrast CT with synthetic contrast-enhanced CT (NCCT + sCECT), and contrast-enhanced CT (CECT). Undetected lesions are labeled as 0 and indicated by hatched bars.

Discussion

This study demonstrated the technical feasibility of deep learning-based synthetic contrast-enhanced CT (sCECT) in chest CT and evaluated the performance of this approach in depicting mediastinal lymph nodes. In patients with mediastinal lymphadenopathy, sCECT demonstrated a higher contrast-to-noise ratio of lymph nodes (6.15 vs 0.74; P < 0.001) than non-contrast CT (NCCT). In an observer study on the same patients, radiologists detected more lymph nodes (reviewer 1, 76% [42 of 55 nodes] vs 49% [27 of 55 nodes], P = 0.003; reviewer 2, 38% [21 of 55 nodes] vs 29% [16 of 55 nodes], P = 0.06) with higher lesion conspicuity (P ≤ 0.001) on NCCT with sCECT than on NCCT alone. The reader-averaged JAFROC FOMs calculated from NCCT alone, NCCT with sCECT, and CECT were 0.48, 0.52, and 0.68, respectively. There was no significant difference in JAFROC FOMs between the modalities (P = 0.059).

The most important strength of the current study is that we performed technical validation on a heterogeneous test set of CT data, including various CT vendors and scanning parameters. Many studies have shown deep learning applications of image-to-image synthesis in radiology, including cross-modality synthesis and reconstruction, but reports on external data are rare3. We believe that the quantitative performance of the proposed model shows the potential for generalizability, which is essential for any deep learning model to be used in clinical practice22.

Few previous studies have applied deep learning for synthetic contrast enhancement in CT. Santini et al.5 demonstrated synthetic enhancement in non-contrast cardiac CT to delineate the left cardiac chambers. Liu et al.23 proposed a deep learning model to generate synthetic enhancement of major arteries in non-contrast abdominopelvic CT. However, to our knowledge, there are no previous studies that have performed end-to-end conversion of a whole volume of NCCT into sCECT images. We believe that acquiring VNC CT in the development set played a crucial role in the successful training of the proposed model. Misalignment between non-contrast and ground-truth contrast-enhanced images is an obstacle in the development of synthetic contrast enhancement24,25. The VNC reconstruction of dual-energy CT enabled perfect spatial registration between the input and ground truth.

The observer study performed by two radiologists showed that the mediastinal lymph nodes were more conspicuous on sCECT than on NCCT, which can be attributed to the higher CNR of the lymph nodes. However, only one radiologist showed a statistically significant increase in the detection rate on sCECT images compared to NCCT images. The trained model delineated hilar and segmental lymph nodes adjacent to pulmonary vessels relatively poorly; such nodes are often difficult to detect on NCCT. Further training on a more heterogeneous group of patients with mediastinal lymphadenopathy may improve the model’s performance. Nonetheless, the proposed model successfully generated sCECT images with higher CNR in terms of technical feasibility.

Importantly, we do not claim that our deep learning implementation or methods to generate sCECT can replace CECT. The ultimate goal of our study on sCECT is to yield additional information from NCCT, including improved lesion conspicuity and detectability, not to predict the degree or pattern of contrast enhancement of the lesions. The vast majority of chest CT examinations do not require contrast media; in addition, sCECT has a potential benefit in patients with certain conditions, including allergy to iodinated contrast media, frequent CT examinations, chronic kidney disease, and poor vascular access. Additionally, we believe that sCECT can be utilized as a type of post-processing technique. A future application of sCECT is its use in automated volumetric segmentation and analysis. A previous study used synthetic non-contrast CT to improve the generalizability of CT segmentation tasks26. Likewise, sCECT may enable segmentation tools based on CECT to be generalized to NCCT data.

Our study has several limitations. First, our study included a small number of patients. However, such a number is reasonable because generative models, unlike classification models, demand high computational loads. Several previous studies on image synthesis in radiology were also based on small study populations18,27. Second, we could not strictly control CT protocols and indications because of the retrospective nature of our study. The ideal training and test sets might have consisted of patients with similar diseases and similar CT protocols. However, dual-energy CT with a routine contrast amount or CT angiography for suspected lung cancer is not commonly performed in clinical practice. Lastly, our proposed model may not be an optimal deep learning approach for sCECT. Comparison and combination with different approaches, including convolutional neural networks (e.g., U-Net7) and generative models based on unpaired data (e.g., CycleGAN28), are warranted.

In conclusion, we implemented a deep learning model for generating synthetic contrast enhancement from non-contrast chest CT. Synthetic contrast-enhanced CT demonstrated good quantitative performance in terms of image similarity metrics and improved depiction of mediastinal lymph nodes. Applying the proposed deep learning model in clinical practice requires further studies on a larger population with more heterogeneous diseases.