Abstract
Deep learning techniques have promoted the rise of artificial intelligence (AI) and performed well in computer vision. Medical image analysis is an important application of deep learning, which is expected to greatly reduce the workload of doctors, contributing to more sustainable health systems. However, most current AI methods for medical image analysis are based on supervised learning, which requires a lot of annotated data. The number of medical images available is usually small and the acquisition of medical image annotations is an expensive process. Generative adversarial network (GAN), an unsupervised method that has become very popular in recent years, can simulate the distribution of real data and reconstruct approximate real data. GAN opens some exciting new ways for medical image generation, expanding the number of medical images available for deep learning methods. Generated data can solve the problem of insufficient data or imbalanced data categories. Adversarial training is another contribution of GAN to medical imaging that has been applied to many tasks, such as classification, segmentation, or detection. This paper investigates the research status of GAN in medical images and analyzes several GAN methods commonly applied in this area. The study addresses GAN application for both medical image synthesis and adversarial learning for other medical image tasks. The open challenges and future research directions are also discussed.
1 Introduction
Deep learning has dominated the field of computer vision since 2012 [1], taking advantage of the huge improvement in data storage and computing power of modern processing devices. Currently, most advanced methods for computer vision are based on deep learning. In this context, medical image analysis is an important research direction. The advantage of deep learning networks is their ability to automatically extract features [2], so researchers can describe medical images without constructing complex manual features. In deep learning-based medical image analysis, the end-to-end network training method shows significant advantages. Moreover, medical image analysis has huge practical demand and market space. It can be reasonably predicted that deep learning-based medical image analysis has great potential for research in the near future.
Most current artificial intelligence (AI) methods and applications belong to the category of supervised learning [3], which in this case means medical image data must be labeled. This is very difficult and costly to achieve in practice. On one hand, each medical image implies in principle a patient behind it, so the amount of medical image data available is very limited. On the other hand, medical image labeling requires highly specialized medical staff and plenty of time. For example, to train a deep convolutional neural network (CNN) for tumor segmentation, it is necessary for a specialized physician to mark all tumor pixels in the training images. These problems greatly restrict the development of automated, intelligent medical image analysis tools. Generative adversarial networks (GAN) have the potential to provide efficient solutions to these problems.
GAN was proposed in 2014 [4], with the original intention of imitating real data. GAN consists of two subnetworks: generator (\(G\)) and discriminator (\(D\)). During training, \(G\) is used to generate data with a given (expected) distribution, whereas \(D\) is used to determine whether the data it receives are real or fake. The two are trained alternately, and improve together [5]. Eventually, a \(G\) is obtained that can generate data close to the real data distribution, which is the ultimate goal of the method. Applied to medical imaging, it can expand datasets with insufficient amounts of medical image data so that deep learning methods can be used together with the expanded datasets. Another very useful feature of GAN for medical image analysis is its adversarial training strategy, which can be applied to image segmentation, detection, or classification.
Compared with other medical image analysis techniques [6], GAN is still in its infancy and the number of related works available in the literature is relatively small, but it has huge potential. The application of GAN to medical images began in 2016, when only one article on the topic was published [7]. Since 2017, there have been more relevant studies. In this paper, the articles about GAN in medical images published in the past five years are analyzed and summarized in terms of application direction, methods, and other aspects. The rest of this article is organized as follows (Fig. 1). In the second section, GAN methods commonly applied in the medical image field are described in detail, focusing on their technical characteristics. The third section addresses the main application of GAN in this context, namely medical image synthesis. A classification is proposed according to different conditions of generation. The fourth section analyzes the application of GAN in medical image data enhancement. In the fifth section, GAN is discussed as a semi-supervised learning method, which mainly operates through feature and annotation sharing. The sixth section describes the functions of GAN that can be extended to other medical tasks. The seventh section discusses technical and non-technical challenges and directions. Finally, the conclusions are summarized in the eighth section.
2 GAN technology
This section starts with the original GAN and then covers the evolution process of GAN when used for image generation. The methods considered are also frequently used in the specific field of medical images. The section emphasizes the overall architecture, data flow, and objective function of GAN, and does not address the network details of specific generators or discriminators.
2.1 Original GAN
The operation of the original GAN is shown in Fig. 2a, where the symbols \(G\) and \(D\) denote neural networks. The input of \(G\) is a random noise vector \(z\), which is sampled from the distribution \(p(z)\). Generally, for consistency and convenience of training, \(p(z)\) is chosen as either a Gaussian or a uniform distribution. It should be noted that \(z\) is a low-dimensional vector, whereas images in actual applications are high-dimensional data, so \(G\) learns the mapping from a low-dimensional noise space to a high-dimensional real data space. The inputs to \(D\) include \(G(z)\), generated fake data, and \(X\), real sample data used to balance training data. The symbol \(D\) is a classifier whose purpose is to judge the truth or falsehood of data. The purpose of \(G\) is to produce data as close to the real ones as possible, confusing \(D\) so that it cannot distinguish which ones are real and which ones are fake. In this way, \(G\) and \(D\) take part in a dynamic game process, improving each other during training. Data generated by \(G\) will be more and more realistic, and the recognition rate of \(D\) will gradually decrease from an initial value equal (or close) to 1 (perfect discrimination of real and fake data) to (optimally) 0.5, the chance level at which fake data cannot be distinguished from real ones. The optimization functions for \(D\) and \(G\) are as follows:
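A common way of writing these two objectives, following the original formulation in [4], is given below. The notation \(p_{r}(x)\) for the real data distribution is an assumption of this sketch; the other symbols are those defined in this section.

```latex
\begin{aligned}
L_{D} &= \max_{D}\; \mathbb{E}_{x \sim p_{r}(x)}\left[\log D(x)\right]
        + \mathbb{E}_{z \sim p(z)}\left[\log\left(1 - D(G(z))\right)\right], \\
L_{G} &= \min_{G}\; \mathbb{E}_{z \sim p(z)}\left[\log\left(1 - D(G(z))\right)\right].
\end{aligned}
\tag{1}
```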
where \(L_{D}\) and \(L_{G}\) represent the \(D\) and \(G\) loss functions, respectively. \(D(x)\) should be close to 1, because \(x\) is real data, and \(D\) tries to push \(D(G(z))\) toward 0, whereas \(G\) tries to raise it. The optimization process consists of maximizing \(L_{D}\) and minimizing \(L_{G}\).
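As a small numerical illustration, the sketch below evaluates batch estimates of the standard GAN objectives from [4] directly from discriminator outputs. The function names and the example probabilities are hypothetical and chosen only for illustration; no networks are trained here.

```python
import math


def discriminator_objective(d_real, d_fake):
    """Batch estimate of E[log D(x)] + E[log(1 - D(G(z)))], which D maximizes."""
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_objective(d_fake):
    """Batch estimate of E[log(1 - D(G(z)))], which G minimizes."""
    return sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)


# Hypothetical discriminator outputs (probabilities that the input is real).
# Early in training: D separates real and generated samples well.
confident = discriminator_objective(d_real=[0.9, 0.95], d_fake=[0.1, 0.05])
# Ideal end of training: D outputs 0.5 everywhere (it can only guess).
confused = discriminator_objective(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])

print(f"L_D with a confident D: {confident:.4f}")
print(f"L_D with a confused D:  {confused:.4f}")
print(f"L_G when D is fooled half the time: {generator_objective([0.5, 0.5]):.4f}")
```

The value of the discriminator objective falls as \(G\) improves, reaching \(-2\log 2\) when \(D\) outputs 0.5 for every sample, while the generator objective falls as \(D(G(z))\) rises.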
2.2 DCGAN
The original GAN is not well suited to generating images, because \(G\) and \(D\) are ordinary fully connected networks. The distribution of image data is very complex and high-dimensional, so it is not easy for such networks to model. CNNs are more image-friendly than fully connected networks, and deep convolutional generative adversarial networks (DCGAN) have successfully combined CNNs with GAN, resulting in a more suitable solution for image generation [8]. DCGAN also adopts the structure shown in Fig. 2a, except that \(G\) and \(D\) are both replaced by CNNs. GAN training is not stable and suffers from mode collapse: generated images may only belong to a few fixed categories, or some strange images may appear. DCGAN proposes a series of techniques to balance the training process. \(G\) and \(D\) are fully convolutional networks (FCNs, i.e., CNNs without fully connected layers), using strided convolution instead of pooling layers for down-sampling. Batch normalization [9], a data normalization layer that can be embedded in the network to accelerate learning and convergence, is used in all layers except the output layer of \(G\) and the input layer of \(D\). In DCGAN, activation functions are also changed with regard to GAN: DCGAN uses ReLU [10] (Fig. 3a) for \(G\) and LeakyReLU [11] (Fig. 3b) for \(D\), to prevent gradient sparsity. In addition, the activation function of the output layer of \(G\) is tanh.
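The effect of replacing pooling with strided convolution, and the difference between the two activation functions, can be sketched with a few lines of arithmetic. The kernel size 4, stride 2, padding 1, the 64-pixel image size, and the LeakyReLU slope 0.2 are assumptions made for illustration, not values stated in this review.

```python
def conv_out(size, kernel=4, stride=2, padding=1):
    """Spatial size after a strided convolution (down-sampling, as in D)."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_transpose_out(size, kernel=4, stride=2, padding=1):
    """Spatial size after a transposed convolution (up-sampling, as in G)."""
    return (size - 1) * stride - 2 * padding + kernel


def relu(x):
    """ReLU: negative inputs are set to zero, so their gradient vanishes."""
    return max(0.0, x)


def leaky_relu(x, slope=0.2):
    """LeakyReLU: negative inputs keep a small slope, so gradients stay non-zero."""
    return x if x > 0 else slope * x


# Discriminator: each strided convolution halves the feature-map size.
d_sizes = [64]
while d_sizes[-1] > 4:
    d_sizes.append(conv_out(d_sizes[-1]))

# Generator: each transposed convolution doubles the feature-map size.
g_sizes = [4]
while g_sizes[-1] < 64:
    g_sizes.append(conv_transpose_out(g_sizes[-1]))

print("D feature-map sizes:", d_sizes)
print("G feature-map sizes:", g_sizes)
print("relu(-1.0) =", relu(-1.0), " leaky_relu(-1.0) =", leaky_relu(-1.0))
```

With these assumed hyperparameters, no pooling layer is needed: the stride alone controls the change of resolution in both networks.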
2.3 CGAN
GAN uses a random noise vector with a very low dimension to generate high-dimensional image data. This modeling method has too many degrees of freedom. If the noise signal has only hundreds of dimensions but the generated image has thousands of pixels, then controllability will be very poor. Conditional generative adversarial networks (CGAN, Fig. 2b) increase controllability by adding a constraint \(c\) to data [12], which is part of the input layer of both \(G\) and \(D\), guiding data generation. The objective function of CGAN is
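Written in the same form as the original GAN objectives and following [12], with both networks conditioned on \(c\), it can be expressed as below. The notation \(p_{r}(x)\) for the real data distribution is an assumption of this sketch.

```latex
\begin{aligned}
L_{D}^{CGAN} &= \max_{D}\; \mathbb{E}_{x \sim p_{r}(x)}\left[\log D(x \mid c)\right]
              + \mathbb{E}_{z \sim p(z)}\left[\log\left(1 - D(G(z \mid c) \mid c)\right)\right], \\
L_{G}^{CGAN} &= \min_{G}\; \mathbb{E}_{z \sim p(z)}\left[\log\left(1 - D(G(z \mid c) \mid c)\right)\right],
\end{aligned}
```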
where \(c\) can be a label, tags, data from different modalities, or even an image. For example, the prior condition (see Sect. 3.3) used by Pix2pix [13] is a segmentation image or a contour image, so Pix2pix can complete the transformation from image to image. When the prior condition is an image, a loss between conditional and generated images is usually added, so that the generated image can have higher authenticity [13]. InfoGAN [14] can also be viewed as a special kind of CGAN. Different from CGAN, it tries to add constraints in random noise \(z\) and uses regularization terms based on mutual information. As the input of the network, \(z\) controls the image generation. For instance, in the MNIST dataset [15], \(z\) controls the thickness, slope, and other characteristics of the generated digits.
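When the condition is a class label, one simple way of making it part of the input layer is to append a one-hot encoding of the label to the noise vector. The sketch below assumes this concatenation scheme, a 100-dimensional noise vector, and 10 classes (as in MNIST); all function names are hypothetical.

```python
import random


def one_hot(label, num_classes):
    """Encode an integer class label as a one-hot vector."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def conditioned_generator_input(z, label, num_classes):
    """Concatenate noise z and condition c to form the generator input."""
    return list(z) + one_hot(label, num_classes)


rng = random.Random(0)  # fixed seed so the example is reproducible
z = [rng.gauss(0.0, 1.0) for _ in range(100)]  # z sampled from a Gaussian p(z)

# Ask the (hypothetical) generator for an image of class 3.
g_input = conditioned_generator_input(z, label=3, num_classes=10)
print("generator input length:", len(g_input))
print("condition part:", g_input[100:])
```

The same condition vector would also be appended to the discriminator input, so that \(D\) judges whether a sample is realistic given \(c\), not merely realistic.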
2.4 CycleGAN
Pix2pix requires paired images, one of them annotated, which requires a lot of time and implies a high cost. In contrast, CycleGAN [16] proposes a closed-loop (ring) network consisting of two generators and two discriminators (Fig. 2c), which performs the conversion between two image domains without the need of paired images. Because of the two generators and discriminators, the overall structure and data flow are more complex than in the previous methods. The symbols \(G_{B}\) and \(G_{A}\) perform the transformation from domain A to domain B and from domain B to domain A, respectively, so they are equivalent to two reciprocal mappings. The symbol \(G_{B}\) generates images \(X_{{{\text{fB}}}}\) with domain B characteristics from images \(X_{A}\) of domain A, whereas \(G_{A}\) generates images \(X_{{{\text{fA}}}}\) with domain A characteristics from images \(X_{B}\) of domain B. Discriminators \(D_{A}\) and \(D_{B}\) identify images of domains A and B, respectively. The objective function of CycleGAN can be written as:
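Following the formulation in [16] and using the symbols of this section, a standard form of the objective is given below. The argument order of \(L_{GAN}\) and the use of the \(L_{1}\) norm in \(L_{cyc}\) are taken from [16] and are an assumption with respect to this review.

```latex
\begin{aligned}
L(G_{A}, G_{B}, D_{A}, D_{B}) &= L_{GAN}(G_{B}, D_{B}, X_{A}, X_{B})
  + L_{GAN}(G_{A}, D_{A}, X_{B}, X_{A})
  + \lambda L_{cyc}(G_{A}, G_{B}), \\
L_{cyc}(G_{A}, G_{B}) &= \mathbb{E}_{x_{A} \sim X_{A}}\left[\lVert G_{A}(G_{B}(x_{A})) - x_{A} \rVert_{1}\right]
  + \mathbb{E}_{x_{B} \sim X_{B}}\left[\lVert G_{B}(G_{A}(x_{B})) - x_{B} \rVert_{1}\right],
\end{aligned}
```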
where \(L_{GAN}\) is a regular generator loss, as described by Eq. (1). Real data return to their original domain after a loop, so \(L_{cyc}\) represents the loss between real data and their cyclic reconstruction. \(\lambda\) is a coefficient used to balance generator loss and cycle loss.
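The cycle loss can be illustrated with toy generators acting on short lists of pixel values. In the sketch below, domain B is assumed to be domain A shifted by a constant, the \(L_{1}\) distance is assumed as the comparison measure (as in [16]), and all function names are hypothetical.

```python
def l1_distance(a, b):
    """Mean absolute difference between two equally sized images."""
    return sum(abs(u - v) for u, v in zip(a, b)) / len(a)


def cycle_loss(batch_a, batch_b, g_b, g_a):
    """L_cyc: real data should return to themselves after a full loop."""
    loss_a = sum(l1_distance(g_a(g_b(x)), x) for x in batch_a) / len(batch_a)
    loss_b = sum(l1_distance(g_b(g_a(x)), x) for x in batch_b) / len(batch_b)
    return loss_a + loss_b


# Toy stand-ins for the generators: domain B is domain A brightened by 2.
def to_b(x):
    return [v + 2.0 for v in x]       # plays the role of G_B (A -> B)


def to_a(x):
    return [v - 2.0 for v in x]       # exact inverse, plays the role of G_A


def bad_to_a(x):
    return [v - 1.5 for v in x]       # imperfect inverse of G_B


batch_a = [[0.0, 1.0], [2.0, 3.0]]    # unpaired samples from domain A
batch_b = [[2.0, 3.0], [4.0, 5.0]]    # unpaired samples from domain B

perfect = cycle_loss(batch_a, batch_b, to_b, to_a)
imperfect = cycle_loss(batch_a, batch_b, to_b, bad_to_a)
print("cycle loss with reciprocal generators:", perfect)
print("cycle loss with an imperfect inverse:", imperfect)
```

The loss vanishes only when the two generators are reciprocal mappings, which is the constraint that replaces paired supervision.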
Since it is easy for GAN to be unbalanced in training, the two generators and discriminators in CycleGAN need to be carefully balanced during training. The use of paired images is equivalent to a feature filtering, and GAN can easily learn which parts of images need to be converted. However, the training process requires huge amounts of data when working with unpaired images, like in the case of CycleGAN.
2.5 LAPGAN
Humans usually paint a picture with multiple strokes, so machines can likewise create images in multiple steps. That is where the idea of LAPGAN [17] comes from. There is no need to complete the whole generation task at once; instead, a full image is generated in several steps. Figure 2d shows a three-stage LAPGAN, the red arrows representing down-sampling and the blue arrows representing up-sampling. The three down-sampling processes can be regarded as a three-layer Laplacian pyramid, and an independent conditional GAN model is trained at each level. Using the multi-scale structure of natural images, a series of generative models are constructed, each one capturing the image structure at a specific scale of the pyramid. The training process is carried out from left to right. The original image \(X_{r1}\) is transformed into \(X_{r1}^{^{\prime}}\) through down-sampling, and \(X_{r1}^{^{\prime}}\) becomes \(X_{r1}^{^{\prime\prime}}\) through up-sampling. Then a residual image is obtained by comparing \(X_{r1}\) with \(X_{r1}^{^{\prime\prime}}\). \(G_{1}\) takes a noise signal \(z_{1}\) as input and \(X_{r1}^{^{\prime\prime}}\) as the condition to generate the residual image. Training in the remaining levels is similar. The LAPGAN test process is shown in Fig. 2e. In this case it is performed from right to left. It is important to note that the target of \(G\) is the residual image, so there is a summation process. Serialization and the use of residual images are the two LAPGAN characteristics that effectively reduce the content and difficulty that GAN needs to learn.
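The pyramid of residual images and the summation used at test time can be sketched on a one-dimensional signal. This is a simplification under stated assumptions: pair averaging stands in for the blur-and-decimate step, sample repetition stands in for up-sampling, and the true residuals are used where a trained LAPGAN would use the residuals produced by each generator.

```python
def downsample(signal):
    """Halve the resolution by averaging neighbouring pairs."""
    return [(signal[i] + signal[i + 1]) / 2.0 for i in range(0, len(signal), 2)]


def upsample(signal):
    """Double the resolution by repeating each sample."""
    return [v for v in signal for _ in range(2)]


def build_pyramid(signal, levels):
    """Return the residual at each level and the coarsest signal."""
    residuals = []
    current = signal
    for _ in range(levels):
        coarse = downsample(current)          # down-sampling step
        blurred = upsample(coarse)            # down- then up-sampled version
        residuals.append([c - b for c, b in zip(current, blurred)])
        current = coarse
    return residuals, current


def reconstruct(residuals, coarsest):
    """Coarse-to-fine pass: up-sample, then add the residual of that level."""
    current = coarsest
    for residual in reversed(residuals):
        current = [u + r for u, r in zip(upsample(current), residual)]
    return current


x = [1.0, 3.0, 2.0, 6.0, 5.0, 5.0, 0.0, 4.0]
residuals, coarsest = build_pyramid(x, levels=3)
restored = reconstruct(residuals, coarsest)
print("coarsest level:", coarsest)
print("restored equals original:", restored == x)
```

Each level only has to account for the detail missing at its own scale, which is why learning the residuals is easier than learning the full image in one step.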
3 Medical image synthesis
The most successful application of GAN in medical image analysis to date is medical image synthesis, which can alleviate the problems of insufficient medical images available or imbalanced data categories [18, 19]. Traditional data enhancement techniques include image cropping, flipping, and mirroring, among others. These techniques can only change data in direction or size, but no new data are generated, whereas GAN can generate completely new data. In this section, unconditional synthesis, domain transformation and other conditional synthesis methods are described according to different conditions of medical images. Figure 4 shows some examples of these applications.
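The limitation of traditional data enhancement can be seen in a minimal sketch: flipping and cropping only rearrange or select pixels that already exist. The tiny 3×3 image and the function names below are hypothetical.

```python
def horizontal_flip(image):
    """Mirror each row of the image; pixel values are unchanged."""
    return [row[::-1] for row in image]


def crop(image, top, left, height, width):
    """Cut a rectangular patch out of the image."""
    return [row[left:left + width] for row in image[top:top + height]]


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

flipped = horizontal_flip(image)
patch = crop(image, top=0, left=1, height=2, width=2)

original_values = {v for row in image for v in row}
print("flipped:", flipped)
print("patch:", patch)
# Every augmented pixel already exists in the original image.
print("no new values:", {v for row in flipped + patch for v in row} <= original_values)
```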
Medical image synthesis examples. Unconditional synthesis of brain magnetic resonance images (MRI) [20] a and of skin lesions [51], the generating condition is set as real image. GAN's role is to add different disease features into healthy images to generate realistic CXR images with different diseases. Lejmer et al. [59].
Chen et al. [60] proposed a multistage dense connection network with GAN for 3D brain MRI super-resolution reconstruction. The generator convolutional layers are all connected in a dense manner, whose main advantage is high speed. Irina et al. [61] used a combination of least squares dual loss and image gradient as loss function for the generator in GAN 3D super-resolution reconstruction, improving the quality of generated images.
4.2 Denoising
Noise in medical images seriously affects the diagnostic accuracy of doctors. This problem can be alleviated by GAN image denoising capabilities. In CT images, since high doses can harm the patient's health, the past decade has seen a trend towards dose reduction in CT examinations, at the expense of noise appearing in the low-dose images. Yang et al. [62] proposed a GAN with Wasserstein distance and perceptual similarity, which suppresses noise by comparing the perceptual features of a denoised output against those of the ground truth in a given feature space. Wolterink et al. [63] compared three training strategies, namely voxel loss, combined voxelwise and adversarial loss, and adversarial loss. Choi et al. [64] considered the statistical characteristics of CT images and introduced a loss function to incorporate the noise property in the image domain derived from noise statistics in the sinogram domain.
In addition to the use of lower doses, the imaging equipment itself (e.g., portable devices) may also introduce noise. Zhou et al. [65] constructed a two-stage GAN to improve the quality of ultrasound images and reduce noise. In the training process, a transfer learning method based on plane wave image (PWI) data was introduced to facilitate convergence and eliminate the influence of deformation caused by respiratory activity. Chen et al. [66] proposed an unsupervised learning framework for high-quality pixel-level smoke detection and removal. The detection network is regarded as prior knowledge, and a loss function is used to support the training of the smoke removal network.
4.3 Reconstruction
MRI is a widely used clinical medical imaging method, but one of its main disadvantages is the long acquisition time. During MRI imaging, data samples are not collected directly in the image space, but in k-space. The k-space contains spatial frequency information obtained row by row and at any position. Slow acquisition causes interferences that may reduce image quality, due for instance to patient movements, such as heartbeats or breathing. Compressed sensing-based imaging provides a solution to accelerate the acquisition of MRI images by reconstructing them from a small part of k-space. In theory, assuming that the original data can be compressed, the reconstruction can be performed through nonlinear optimization of randomly under-sampled original data. GAN-based MRI image reconstruction is based on this theory and can be summarized as follows. The generator consists of multiple end-to-end networks. The first one converts a zero-filled reconstructed image into a complete reconstructed image. The following refinement network improves the accuracy of the reconstructed image. Then a discriminator network assesses whether or not the reconstruction is accurate. The works reported in [67, 68, 69] are all based on this framework, whose structure is shown in Fig. 4, the difference being the loss functions used. In order to improve the perceived quality of reconstruction, a content loss is designed for generator training in [67]. This loss includes three parts: pixel mean squared error loss, frequency domain mean squared error loss, and VGG loss. Feature matching loss and penalty are added in [68]. The work in [69] adds cycle loss, which is a cycle combination of low sampling frequency and completely reconstructed images.
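The zero-filled reconstruction that serves as generator input can be sketched in one dimension: the signal is transformed to k-space, the unsampled coefficients are set to zero, and the inverse transform is applied. The toy signal, the naive discrete Fourier transform, and the deterministic low-frequency mask (chosen symmetric so that the result stays real) are assumptions for illustration; practical schemes use two-dimensional data and typically random sampling patterns.

```python
import cmath


def dft(signal):
    """Naive discrete Fourier transform (image space -> k-space)."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse transform (k-space -> image space)."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def zero_filled_reconstruction(signal, mask):
    """Keep only the sampled k-space entries, fill the rest with zeros, invert."""
    kspace = dft(signal)
    undersampled = [value if keep else 0.0 for value, keep in zip(kspace, mask)]
    return [v.real for v in idft(undersampled)]


def max_error(a, b):
    return max(abs(u - v) for u, v in zip(a, b))


signal = [0.0, 1.0, 4.0, 9.0, 4.0, 1.0, 0.0, 0.0]   # toy 1D "image"
full_mask = [True] * 8                                # fully sampled k-space
low_freq_mask = [k in (0, 1, 2, 6, 7) for k in range(8)]  # 5 of 8 samples kept

full = zero_filled_reconstruction(signal, full_mask)
zero_filled = zero_filled_reconstruction(signal, low_freq_mask)
print("error with full sampling:", max_error(full, signal))
print("error with under-sampling:", max_error(zero_filled, signal))
```

The residual error of the zero-filled result is what the generator and the refinement network are trained to remove.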
4.4 Registration
To get accurate pathological information in the process of medical diagnosis, a set of images is taken of the same body part, so it is usually necessary to conduct quantitative analyses of several different images at the same time. These images need to be strictly aligned, which is called image registration. It requires a spatial transformation of images, so that there is spatial consistency between corresponding points in several images. In [70], a constrained CNN replaced heuristic smoothness measures of displacement fields: the generator is the registration network, and the discriminator distinguishes the dense displacement field predicted by the generator from motion data simulated with the finite element method. During training, the registration network maximizes the similarity between anatomical labels and minimizes the difference between measured and simulated deformation. The generator in [71] generates transformation parameters between fixed and moving images. Different from [70], the discriminator is not used to assess the transformation parameters, but to determine whether or not the processed moving image has completed registration. The work in [72] used CGAN for multimodal registration. By adding appropriate terms into the loss function of image generation, the generated output image has the same features as the moving image.
Christine et al. [85] proposed a GAN to synthesize 3D CT scan images from X-ray ones, and then used a multi-organ segmentation network for segmentation. Konstantinos et al. [86] proposed a domain adaptive multi-connected adversarial network, where different data types are treated as different domains, making features learnt by segmentation independent of domain-specific factors. Good adaptability was shown with two different brain MRI databases. In [87], also from the point of view of using domains to solve segmentation problems caused by inconsistent data, a network that transfers specific image styles was used. An unannotated color fundus image dataset was changed to the style of an annotated dataset. In this way, the segmentation network trained on annotated datasets can be used to segment unannotated images.
6 Function expansion of GAN
6.1 Extended generator and discriminator
The adversarial learning process of the generator and discriminator produces a large number of advanced semantic features that can be extended to other tasks. The applicability of extended generators and discriminators is not limited, respectively, to image synthesis and to the classification of true and fake images.
Das et al. [88] proposed a generalizable classifier using adversarial learning between generator and discriminator to predict progressive retinal diseases such as age-related macular degeneration and diabetic macular edema. Gu et al. [89] proposed a transfer recurrent feature learning framework for probe-based confocal laser endomicroscopy (pCLE) video classification tasks. In a first phase, the discriminator features of pCLE images are learnt by GAN. In a second phase, discriminator features are applied to a recurrent neural network (RNN) to distinguish between true and false data and lesion grade. It can be seen that the discriminator is mainly expanded into a multiclass classifier.
Some researchers suggested using generators to segment images [87,
7.2 Non-technology challenges and directions
1) Privacy. The collection of medical images for scientific research requires patient consent. It is not clear whether generated images, or datasets built from them, are to be considered original data or new data, and therefore whether they should be subject to patient consent or not. The legality of new data is also uncertain. Some applications of GAN, such as domain transformation, may even expose more of patients' personal information than original images. Therefore, for the application of new technology, not only its feasibility but also ethics and law must be considered.
2) Image confidence. In the field of medical imaging, the interpretation of an image may affect the life of the patient, so many technologies that work well for similar purposes in other areas may not be applicable in the medical field. Sometimes even a normal medical image will not be given enough trust by doctors, and multi-level detection is still needed. In this context, there is currently no reason for doctors to trust images generated by GAN. Cohen et al. [107] questioned the medical images generated by GAN, which may lead to misjudging the medical condition of patients. They trained a CycleGAN to convert normal brain MRI images to brain MRI images with tumors. The images generated by their network were visually realistic, but without tumors. There are many possible reasons behind this. For instance, the generalization performance of a well-trained model may still be poor, or the transformations between some data domains cannot be accurately carried out. Attention should be paid to this issue, but this does not mean that all GANs will lead to misdiagnosis.
3) Datasets. Although there are many publicly available datasets, most of them were created not for use with GAN, but for other medical tasks. The quality of existing medical datasets is uneven, and some are old and scattered. For some tasks, such as the transformation between MRI and CT images, it is difficult to find relevant images of a certain scale. Most researchers collect them by themselves through hospitals.
8 Conclusion
This paper, oriented to GAN for medical imaging, summarizes commonly used GAN methods, medical image synthesis, and the function of adversarial learning in other medical image tasks. The relevant papers in the area published in the last five years are reviewed. The challenges of datasets, training methods, reliability, and legality are pointed out. Future directions of unsupervised learning, breakthroughs in clinical needs, and the need for GANs more suitable for medical imaging are also discussed. In general, the existing medical image synthesis technology has a high reliability, and the combination of GAN and other medical image models also produces a good effect. It can be concluded that GAN has great potential and development prospects in medical imaging. In fact, the whole development trend of artificial intelligence is towards unsupervised (deep) learning.
References
Krizhevsky A, Sutskever I, Hinton GE. Imagenet classification with deep convolutional neural networks. Adv Neural Inf Process Syst. 2012. https://doi.org/10.1145/3065386.
LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436.
Jiang Y, Yin S, Dong J, Kaynak O. A Review on soft sensors for monitoring, control and optimization of industrial processes. IEEE Sens J. 2021;21(11):12868–81.
Goodfellow I, Pouget-Abadie J, Mirza M et al. Generative adversarial nets. In: Advances in Neural Information Processing Systems. 2014. p. 2672–80.
Creswell A, White T, Dumoulin V, et al. Generative adversarial networks: an overview. IEEE Signal Process Mag. 2018;35(1):53–65.
Liu S, Li X, Jiang Y, Luo H, Gao Y, Yin S. Integrated learning approach based on fused segmentation information for skeletal fluorosis diagnosis and severity grading. IEEE Trans Ind Inf. 2021;17(11):7554–63.
Yi X, Walia E, Babyn P. Generative adversarial network in medical imaging: a review. Med Image Anal. 2019;58:
Radford A, Metz L, Chintala S. Unsupervised representation learning with deep convolutional generative adversarial networks. In: International conference on image and graphics. 2017. p. 97–108.
Ioffe S, Szegedy C. Batch normalization: accelerating deep network training by reducing internal covariate shift. In: International conference on machine learning. 2015. p. 448–56.
Glorot X, Bordes A, Bengio Y. Deep sparse rectifier neural networks. In: Proceedings of the fourteenth international conference on artificial intelligence and statistics. 2011. p. 315–23.
He K, Zhang X, Ren S, Sun J. Delving deep into rectifiers: surpassing human-level performance on imagenet classification. In: IEEE international conference on computer vision. 2015. p. 1026–34.
Mirza M, Osindero S. Conditional generative adversarial nets. Comput Sci 2014;2672–80.
Isola P, Zhu J-Y, Zhou T, Efros AA. Image-to-image translation with conditional adversarial networks. In: IEEE conference on computer vision and pattern recognition. 2017. p. 1125–34.
Chen X, Duan Y, Houthooft R et al. Infogan: interpretable representation learning by information maximizing generative adversarial nets. In: International conference on neural information processing systems. 2016. p. 2172–80.
LeCun Y, Cortes C, Burges ChJC. The MNIST database of handwritten digits. http://yann.lecun.com/exdb/mnist/.
Zhu J-Y, Park T, Isola P, Efros AA. Unpaired image-to-image translation using cycle-consistent adversarial networks. In: IEEE international conference on computer vision. 2017. p. 2223–32.
Denton EL, Chintala S, Fergus R, et al. Deep generative image models using a laplacian pyramid of adversarial networks. In: International conference on neural information processing systems. 2015. p. 1486–94.
Li K, Luo H, Yin S, et al. A novel bias-eliminated subspace identification approach for closed-loop systems. IEEE Trans Ind Electron. 2020;68(6):5197–205.
Luo H, Li K, Kaynak O, et al. A robust data-driven fault detection approach for rolling mills with unknown roll eccentricity. IEEE Trans Control Syst Technol. 2019;28(6):2641–8.
Bermudez C, Plassard AJ, Davis LT, et al. Learning implicit brain MRI manifolds with deep learning. Int Soc Optics Photon. 2018;10574:105741L.
Beers A, Brown J, Chang K et al. High-resolution medical image synthesis using progressively grown generative adversarial networks; 2018. arXiv preprint arXiv:1805.03144.
BenTaieb A, Hamarneh G. Adversarial stain transfer for histopathology image analysis. IEEE Trans Med Imaging. 2017;37(3):792–802.
Wolterink JM, Dinkla AM, Savenije MH, Seevinck PR, van den Berg CA, Išgum I. Deep MR to CT synthesis using unpaired data. In: International workshop on simulation and synthesis in medical imaging. 2017. p. 14–23.
Ben-Cohen A, Klang E, Raskin SP, Amitai MM, Greenspan H. Virtual PET images from CT data using deep convolutional networks: initial results. In: International workshop on simulation and synthesis in medical imaging. 2017. p. 49–57.
Zhao H, Li H, Maurer-Stroh S, Cheng L. Synthesizing retinal and neuronal images with generative adversarial nets. Med Image Anal. 2018;49:14–26.
Wolterink JM, Leiner T, Isgum I. Blood vessel geometry synthesis using generative adversarial networks; 2018. arXiv preprint arXiv:1804.04381.
Han C, Hayashi H, Rundo L et al. GAN-based synthetic brain MR image generation. International symposium on biomedical imaging. 2018. p. 734–8.
Calimeri F, Marzullo A, Stamile C, Terracina G. Biomedical data augmentation using generative adversarial neural networks. In: International conference on artificial neural networks. Springer; 2017. pp. 626–634.
Salehinejad H, Valaee S, Dowdell T, Colak E, Barfett J. Generalization of deep neural networks for chest pathology classification in x-rays using generative adversarial networks. In: 2018 IEEE international conference on acoustics, speech and signal processing (ICASSP). IEEE; 2018. pp. 990–994.
Kitchen A, Seah J. Deep generative adversarial neural networks for realistic prostate lesion MRI synthesis; 2017. arXiv preprint arXiv:1708.00129.
Chuquicusma MJ, Hussein S, Burt J, Bagci U. How to fool radiologists with generative adversarial networks? A visual Turing test for lung cancer diagnosis. In: 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018). IEEE; 2018. pp. 240–244.
Baur C, Albarqouni S, Navab N. Melanogans: high resolution skin lesion synthesis with gans; 2018. arXiv preprint arXiv:1804.04338.
Baur C, Albarqouni S, Navab N. Generating highly realistic images of skin lesions with GANs. OR 2.0 context-aware operating theaters, computer assisted robotic endoscopy, clinical image-based procedures, and skin image analysis. Cham: Springer; 2018. p. 260–267.
Yi X, Walia E, Babyn P. Unsupervised and semi-supervised learning with categorical generative adversarial networks assisted by wasserstein distance for dermoscopy image classification; 2019. arXiv preprint arXiv:1804.03700.
Frid-Adar M, Diamant I, Klang E, et al. Gan-based synthetic medical image augmentation for increased CNN performance in liver lesion classification. Neurocomputing. 2018;321:321–31.
Nie D, Trullo R, Lian J, et al. Medical image synthesis with context-aware generative adversarial networks. In: International conference on medical image computing and computer-assisted intervention. Springer; 2017. pp. 417–425.
Zhao M, Wang L, Chen J et al. Craniomaxillofacial bony structures segmentation from MRI with deep-supervision adversarial learning. In: International conference on medical image computing and computer assisted intervention. Springer, 2018. pp. 720–727.
Zhang Z, Yang L, Zheng Y. Translating and segmenting multimodal medical volumes with cycle-and shape-consistency generative adversarial network. In: Proceedings of the IEEE conference on computer vision and pattern recognition; 2018. pp. 9242–9251.
Chartsias A, Joyce T, Dharmakumar R, Tsaftaris SA. Adversarial image synthesis for unpaired multi-modal cardiac data. In: International workshop on simulation and synthesis in medical imaging. Springer; 2017. pp. 3–13.
Hiasa Y, Otake Y, Takao M, Matsuoka T, Takashima K, Carass A, Prince JL, Sugano N, Sato Y. Cross-modality image synthesis from unpaired data using cyclegan. In: International workshop on simulation and synthesis in medical imaging. Springer; 2018. pp. 31–41.
Bi L, Kim J, Kumar A, Feng D, Fulham M. Synthesis of positron emission tomography (PET) images via multi-channel generative adversarial networks (GANs). In: Molecular imaging, reconstruction and analysis of moving body organs, and stroke imaging and treatment. Springer; 2017. pp. 43–51.
Wei W, Poirion E, Bodini E et al. Learning myelin content in multiple sclerosis from multimodal MRI through adversarial training. In: International conference on medical image computing and computer-assisted intervention. Springer; 2018. pp. 514–522.
Armanious K, Yang C, Fischer M et al. Medgan: medical image translation using gans; 2018. arXiv preprint arXiv:1806.06397.
Ben-Cohen A, Klang E, Raskin SP, et al. Cross-modality synthesis from CT to PET using FCN and GAN networks for improved automated lesion detection. Eng Appl Artif Intell. 2019;78:186–94.
Zanjani FG, Zinger S, Bejnordi BE, van der Laak JA, et al. Histopathology stain-color normalization using deep generative models. In: International conference on medical imaging with deep learning. 2018.
Bayramoglu N, Kaakinen M, Eklund L, Heikkila J. Towards virtual H&E staining of hyperspectral lung histology images using conditional generative adversarial networks. In: Proceedings of the IEEE international conference on computer vision; 2017. pp. 64–71.
Costa P, Galdran A, Meyer MI, et al. End-to-end adversarial retinal image synthesis. IEEE Trans Med Imaging. 2017;37(3):781–91.
Guibas JT, Virdi TS, Li PS. Synthetic medical images from dual generative adversarial networks; 2017. arXiv preprint arXiv:1709.01872.
Hu Y, Gibson E, Lee L-L et al. Freehand ultrasound image simulation with spatially conditioned generative adversarial networks. In: Molecular imaging, reconstruction and analysis of moving body organs, and stroke imaging and treatment. Springer; 2017. pp. 105–115.
Tom F, Sheet D. Simulating patho-realistic ultrasound images using deep generative networks with adversarial learning. In: 2018 IEEE 15th international symposium on biomedical imaging (ISBI 2018). IEEE; 2018. pp. 1174–1177.
Mahapatra D, Bozorgtabar B, Thiran J-P, Reyes M. Efficient active learning for image classification and segmentation using a sample selection and conditional generative adversarial network. In: International conference on medical image computing and computer-assisted intervention. Springer; 2018. pp. 580–588.
Olut S, Sahin YH, Demir U, Unal G. Generative adversarial training for MRA image synthesis using multi-contrast MRI. In: International workshop on predictive intelligence in medicine. Springer; 2018. pp. 147–154.
Almalioglu Y, Ozyoruk KB, Gokce A, et al. EndoL2H: deep super-resolution for capsule endoscopy. IEEE Trans Med Imaging. 2020; (99):1–1.
Ma J, Cheng S, Yu J, et al. PathSRGAN: multi-supervised super-resolution for cytopathological images using generative adversarial network. IEEE Trans Med Imaging. 2020; (99):1–1.
Ronneberger O, Fischer P, Brox T. U-net: convolutional networks for biomedical image segmentation. In: International conference on medical image computing and computer-assisted intervention. Cham: Springer; 2015. pp. 234–241.
Ravì D, Szczotka AB, Pereira SP, Vercauteren T. Adversarial training with cycle consistency for unsupervised super-resolution in endomicroscopy. Med Image Anal. 2019;53:123–31.
You C, Li G, Zhang Y, et al. CT super-resolution GAN constrained by the identical, residual, and cycle learning ensemble (GAN-CIRCLE). IEEE Trans Med Imaging. 2019;39(1):188–203.
Das V, Dandapat S, Bora PK. Unsupervised super-resolution of OCT images using generative adversarial network for improved age-related macular degeneration diagnosis. IEEE Sens J. 2020; (99):1–1.
Li Z, Wang Y, Yu J. Reconstruction of thin-slice medical images using generative adversarial network. In: International workshop on machine learning in medical imaging. Springer; 2017. pp. 325–333.
Chen Y, Shi F, Christodoulou AG et al. Efficient and accurate MRI super-resolution using a generative adversarial network and 3D multi-level densely connected network. In: International conference on medical image computing and computer-assisted intervention. Springer; 2018. pp. 91–99.
Sánchez I, Vilaplana V. Brain MRI super-resolution using 3D generative adversarial networks. International conference on medical imaging with deep learning. 2018.
Yang Q, et al. Low-dose CT image denoising using a generative adversarial network with Wasserstein distance and perceptual loss. IEEE Trans Med Imaging. 2018;37(6):1348–57.
Wolterink JM, et al. Generative adversarial networks for noise reduction in low-dose CT. IEEE Trans Med Imaging. 2017;36(12):2536–45.
Choi K, Lim JS, Kim S. StatNet: statistical image restoration for low-dose CT using deep learning. IEEE J Sel Topics Signal Process. 2020;99:1–1.
Zhou Z, et al. Image quality improvement of hand-held ultrasound devices with a two-stage generative adversarial network. IEEE Trans Biomed Eng. 2019;67(1):298–311.
Chen L, et al. De-smokeGCN: generative cooperative networks for joint surgical smoke detection and removal. IEEE Trans Med Imaging. 2019;39(5):1615–25.
Yang G, Yu S, Dong H, et al. Dagan: deep de-aliasing generative adversarial networks for fast compressed sensing mri reconstruction. IEEE Trans Med Imaging. 2017;37(6):1310–21.
Seitzer M, Yang G, Schlemper J et al. Adversarial and perceptual refinement for compressed sensing MRI reconstruction. In: International conference on medical image computing and computer-assisted intervention. Springer; 2018. pp. 232–240.
Quan TM, Nguyen-Duc T, Jeong WK. Compressed sensing MRI reconstruction with cyclic loss in generative adversarial networks. IEEE Trans Med Imaging. 2017; 99.
Hu Y, Gibson E, Ghavami N et al. Adversarial deformation regularization for training image registration neural networks. In: International conference on medical image computing and computer assisted intervention. Springer; 2018. pp. 774–782.
Yan P, Xu S, Rastinehad AR, Wood BJ. Adversarial image registration with application for MR and TRUS image fusion. In: International workshop on machine learning in medical imaging. Springer; 2018. pp. 197–204.
Mahapatra D, Antony B, Sedai S, Garnavi R. Deformable medical image registration using generative adversarial networks. In: 2018 IEEE 15th international symposium on biomedical imaging (ISBI 2018). IEEE; 2018. pp. 1449–1453.
Tanner C, Ozdemir F, Profanter R et al. Generative adversarial networks for MR-CT deformable image registration; 2018. arXiv preprint arXiv:1807.07349.
Jiang Y, Yin S, Kaynak O. Data-driven monitoring and safety control of industrial cyber-physical systems: basics and beyond. IEEE Access. 2018;6:47374–84.
Diaz-Pinto A, et al. Retinal image synthesis and semi-supervised learning for glaucoma assessment. IEEE Trans Med Imaging. 2019;38(9):2211–8.
Salehinejad H, Colak E, Dowdell T, et al. Synthesizing chest X-ray pathology for training deep convolutional neural networks. IEEE Trans Med Imaging. 2019;38(5):1197–206.
Xue Y, et al. Selective synthetic augmentation with HistoGAN for improved histopathology image classification. Med Image Anal. 2021;67: 101816.
Yutong X, et al. Semi-supervised adversarial model for benign–malignant lung nodule classification on chest CT. Med Image Anal. 2019;57:237–48.
Wenguang Yuan A, et al. Unified generative adversarial networks for multimodal segmentation from unpaired 3D medical images. Med Image Anal. 2020;64:101731.
Bo H, et al. Unsupervised learning for cell-level visual representation in histopathology images with generative adversarial networks. IEEE J Biomed Health Inf. 2017;23(3):1316–28.
Wang S, et al. Diabetic retinopathy diagnosis using multichannel generative adversarial network with semisupervision. IEEE Trans Autom Sci Eng. 2020;99:1–12.
Li X, Jiang Y, Yin S. Lightweight attention convolutional neural network for retinal vessel image segmentation. IEEE Trans Ind Inf. 2020;17(3):1958–67.
Gadermayr M, et al. Generative adversarial networks for facilitating stain-independent supervised and unsupervised segmentation: a study on kidney histology. IEEE Trans Med Imaging. 2019;38(10):2293–302.
Chen X, et al. One-shot generative adversarial learning for MRI segmentation of craniomaxillofacial bony structures. IEEE Trans Med Imaging. 2019;99:1–1.
Zhang Y, et al. Unsupervised X-ray image segmentation with task driven generative adversarial networks. Med Image Anal. 2020. https://doi.org/10.1016/j.media.2020.101664.
Zhao H, Li H, Maurer-Stroh S, et al. Supervised segmentation of un-annotated retinal fundus images by synthesis. IEEE Trans Med Imaging. 2019;38(1):46–56.
Huo Y, Xu Z, Bao S et al. Splenomegaly segmentation using global convolutional kernels and conditional generative adversarial networks. In: Medical Imaging 2018: Image Processing, vol. 10574. International Society for Optics and Photonics; 2018. p. 1057409.
Das V, Dandapat S, Bora PK. A data-efficient approach for automated classification of OCT images using generative adversarial network. IEEE Sens Lett. 2020;4(1):1–4.
Gu Y, Vyas K, Yang J, Yang G-Z. Transfer recurrent feature learning for endomicroscopy image recognition. IEEE Trans Med Imaging. 2019;38(3):791–801.
Son J, Park SJ, Jung K-H. Retinal vessel segmentation in fundoscopic images with generative adversarial networks; 2017. arXiv preprint arXiv:1706.09318.
Xue Y, Xu T, Zhang H, Long LR, Huang X. Segan: adversarial network with multi-scale L1 loss for medical image segmentation. Neuroinformatics. 2018;16(3–4):383–92.
Sekuboyina A, Rempfler M, Kukačka et al. Btrfly net: vertebrae labelling with energy-based adversarial learning of local spine prior. In: International conference on medical image computing and computer-assisted intervention. Springer; 2018. p. 649–657.
Wu H, et al. Automated left ventricular segmentation from cardiac magnetic resonance images via adversarial learning with multi-stage pose estimation network and co-discriminator. Med Image Anal. 2021;68:101891.
Lei B, et al. Skin lesion segmentation via generative adversarial networks with dual discriminators. Med Image Anal. 2020;64: 101716.
Rachmadi MF, et al. Automatic spatial estimation of white matter hyperintensities evolution in brain MRI using disease evolution predictor deep neural networks. Med Image Anal. 2020;63:101712.
Elazab A, et al. GP-GAN: Brain tumor growth prediction using stacked 3D generative adversarial networks from longitudinal MR Images. Neural Netw. 2020;132:321–32.
Wei W, et al. Predicting PET-derived demyelination from multimodal MRI using sketcher-refiner adversarial training for multiple sclerosis. Med Image Anal. 2019;58: 101546.
Zhao Y, et al. Prediction of Alzheimer's disease progression with multi-information generative adversarial network. IEEE J Biomed Health Inf. 2020;25(3):711–9.
Tang Y, et al. A disentangled generative model for disease decomposition in chest X-rays via normal image synthesis. Med Image Anal. 2020;67: 101839.
Sun L, et al. An adversarial learning approach to medical image synthesis for lesion detection. IEEE J Biomed Health Inf. 2020;24(8):2303–14.
Xia T, Chartsias A, Tsaftaris SA. Pseudo-healthy synthesis with pathology disentanglement and adversarial learning. Med Image Anal. 2020;64:101719.
Schlegl T, Seeböck P, Waldstein SM, Schmidt-Erfurth U, Langs G. Unsupervised anomaly detection with generative adversarial networks to guide marker discovery. In: International conference on information processing in medical imaging. Springer; 2017. pp. 146–157.
Chen X, Konukoglu E. Unsupervised detection of lesions in brain MRI using constrained adversarial auto-encoders; 2018. arXiv preprint arXiv:1806.04972.
Randulfe JL et al. A quantitative method for selecting denoising filters, based on a new edge-sensitive metric. In: Proceedings of 2017 IEEE International Conference on Industrial Technology, ICIT2017. pp. 974–979.
Yin S, Rodriguez J, Jiang Y. Real-time monitoring and control of industrial cyberphysical systems with integrated plant-wide monitoring and control framework. IEEE Ind Electron Mag. 2019;13(4):38–47.
Jiang Y, Yin S, Li K, Luo H, Kaynak O. Industrial applications of digital twins. Phil Trans R Soc A. 2021;379:20200360.
Cohen JP, Luck M, Honari S. Distribution matching losses can hallucinate features in medical image translation. arXiv preprint arXiv:1805.08841.
Author information
Contributions
Conceptualization, XL, SY, JR; resources: OK, SY, and HL; investigation, XL and YJ; writing—original draft preparation, XL and YJ; writing—review and editing, JR, OK, and YJ; supervision, OK, SY, and HL. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Li, X., Jiang, Y., Rodriguez-Andina, J.J. et al. When medical images meet generative adversarial network: recent development and research opportunities. Discov Artif Intell 1, 5 (2021). https://doi.org/10.1007/s44163-021-00006-0