Classification and mutation prediction based on histopathology H&E images in liver cancer using deep learning

Chen, Mingyu; Zhang, Bin; Topatana, Win; Cao, Jiasheng; Zhu, Hepan; Juengpanich, Sarun; Mao, Qijiang; Yu, Hong; Cai, Xiujun

doi:10.1038/s41698-020-0120-3

Article
Open access
Published: 08 June 2020

Volume 4, article number 14, (2020)
npj Precision Oncology

Abstract

Hepatocellular carcinoma (HCC) is the most common subtype of liver cancer, and assessing its histopathological grade requires visual inspection by an experienced pathologist. In this study, histopathological H&E images from the Genomic Data Commons Databases were used to train a neural network (Inception V3) for automatic classification. As evaluated by the Matthews correlation coefficient, the performance of our model was close to that of a pathologist with 5 years of experience, with 96.0% accuracy for benign versus malignant classification and 89.6% accuracy for well, moderate, and poor tumor differentiation. Furthermore, the model was trained to predict the ten most common and prognostic mutated genes in HCC. We found that four of them, CTNNB1, FMN2, TP53, and ZFX4, could be predicted from histopathology images, with external AUCs from 0.71 to 0.89. These findings demonstrate that convolutional neural networks could be used to assist pathologists in the classification of liver cancer and the detection of gene mutations.

Introduction

Hepatocellular carcinoma (HCC) is the fourth leading cause of cancer-related mortality and is currently the main cause of liver-related death, leading to more than one million deaths annually worldwide^1,2,3. Over several decades, substantial progress has been made in the understanding of HCC risk factors, epidemiology, and molecular pathogenesis. Early detection of HCC increases the chance of curative therapy and improves overall survival. Unfortunately, most HCC patients are diagnosed at an intermediate to late stage, which significantly decreases overall survival⁴. Various predominant clinical risk factors for the development of HCC have been defined, including alcohol abuse, cirrhosis, metabolic syndrome, and hepatitis B and/or C virus infection^5,6,7,8. However, multiple genetic alterations and signaling cascades also have a great influence on tumor progression and overall survival⁹.

The understanding of HCC molecular pathogenesis has improved significantly over the past decade¹⁰. Advances in genomic analysis have identified the major drivers responsible for cancer development and progression. HCC has been reported to have around 40 genomic aberrations, some of which are deemed drivers. Several frequent HCC genomic alterations have been identified, including mutations in CTNNB1 (β-catenin WNT pathway activation), TP53, telomere reverse transcriptase (telomere maintenance), AT-rich interaction domain 1A (ARID1A; chromatin remodeling), mammalian target of rapamycin signaling, RAS signaling, oxidative stress pathway activation, and aberrations in DNA methylation¹¹. Previous studies have reported that the heterogeneity of HCC at both the molecular and histological levels is correlated with gene mutations and oncogenic pathways¹². The mutually exclusive CTNNB1 (40%) and TP53 (21%) mutations define two major groups of HCC with distinct phenotypes. CTNNB1-mutated HCC is generally well differentiated and large, with pseudoglandular and microtrabecular patterns, and lacks inflammatory infiltrates, whereas TP53-mutated HCC is generally poorly differentiated, with compact patterns, frequent vascular invasion, and pleomorphic, multinucleated cells¹³. A deeper understanding of HCC phenotypes is essential for improving targeted therapies and clinical translation.

Pathologists can distinguish cancer from normal liver tissue and assess its histopathological grade by visual inspection, but this provides limited information and does not capture the underlying biological differences in HCC gene mutations associated with overall survival. Recent advances in artificial intelligence (AI) have provided a novel way to assist clinicians in classifying medical information and images^14,15.

[...] (Fig. 4). Finally, the liver cancer tiles dataset consisted of four subsets: the training, testing, internal validation, and external validation sets. The data in the training and internal validation cohorts, from the Genomic Data Commons portal (https://portal.gdc.cancer.gov/), were publicly available without restriction, authentication, or authorization. The independent external validation cohort consisted of slide images without identifiable information, and all participants had provided written informed consent. Our study was approved by the SRRSH of Medicine Institutional Review Board (KY20181209-5).

**Fig. 4: Strategy of preparing tiles dataset.**

Technical detail on frozen slides in the external validation cohort

The obtained specimens (e.g., liver tissues) were macroscopically examined, measured, and sectioned through their longest axis, and the midsections were then examined. The material was frozen at −28 °C, cut into 5–10 µm thick sections, stained with hematoxylin and eosin (H&E), and then analyzed by pathologists under a light microscope. Of 70 patients, 67 were diagnosed with HCC, and the corresponding frozen slides were collected. Notably, normal liver tissue was unavailable for half of the obtained specimens, because normal liver tissue had to be at least 2 cm away from the tumor. Therefore, there were only 34 whole-slide images (WSIs) of normal liver tissue. To obtain digital pathology images, each slide was scanned at 20× magnification using a VS120 digital pathology scanner (Olympus).

Deep-learning with convolution neural networks

Typical convolutional neural networks contain several levels of convolution filters, pooling layers, and fully connected layers. In our study, we primarily used the Inception V3 architecture, which is built from inception modules, each combining convolutions with different kernel sizes and a max-pooling layer. The initial five convolution nodes are combined with two max-pooling operations and followed by 11 stacks of inception modules. A fully connected layer was then added at the end of the inception modules so that we could use the pre-trained model and fine-tune the parameters for our own task. Finally, a softmax layer was added as a classifier that outputs a probability for every class, and the class with the highest probability was chosen as the prediction.
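The final classification step described above can be illustrated with a minimal sketch in plain Python. It is not the authors' code: the function names, the logit values, and the class labels are assumptions chosen for illustration, and only the softmax-then-argmax logic reflects the text.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_class(logits, class_names):
    """Return the most probable class and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]


# Hypothetical logits for a single tile; the labels are illustrative only.
label, prob = predict_class([0.2, 2.1, -0.5], ["well", "moderate", "poor"])
print(label, round(prob, 3))
```

In a real network the logits would come from the fully connected layer that follows the inception modules.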

We used the pre-trained model offered by TensorFlow and fine-tuned it using histopathological images. It was pre-trained on the ImageNet dataset and is available in the TensorFlow-Slim image classification library (http://tensorflow.org). We initialized the parameters from the pre-trained model because pre-training can speed up the convergence of the network. More importantly, it is difficult to train a deep network from scratch with a small number of images because of the massive number of network parameters.

Comparison with pathologists

One hundred and one WSIs of liver tissues without a label from the external validation cohort were used to test the pathologists’ performance, which was compared with that of our model. Each pathologist reported whether HCC was present and, if so, its histopathological grade, based on the digital pathology images. The outcomes reported by six pathologists with 2, 5, and 10 years of experience (two pathologists in each category) and by our model were collected and analyzed in R 3.6.0 (https://www.r-project.org). Cohen’s Kappa analysis was performed to assess inter-observer agreement. Good inter-observer agreement was observed between pathologists with 2 years of experience (Kappa = 0.894; 95% CI, 0.837–0.944), pathologists with 5 years of experience (Kappa = 0.933; 95% CI, 0.888–0.975), and pathologists with 10 years of experience (Kappa = 0.967; 95% CI, 0.930–0.992).
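For readers unfamiliar with Cohen's Kappa, the following sketch shows how the statistic is computed for two raters. The study itself used R; this Python version, the function name, and the toy labels are assumptions for illustration, and the confidence intervals reported above are not reproduced here.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's Kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(rater_a)
    # Observed agreement: fraction of items given identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Toy example with hypothetical diagnoses for four slides.
reader_1 = ["HCC", "HCC", "normal", "normal"]
reader_2 = ["HCC", "normal", "normal", "normal"]
kappa = cohens_kappa(reader_1, reader_2)
print(kappa)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is preferred over simple percent agreement when class frequencies are unbalanced.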

Identification of significantly mutated genes

The gene mutation data for the matched patient samples were downloaded from The Cancer Genome Atlas (TCGA). Genes mutated in at least 10% of the available liver cancer samples were selected from the 283 cancer-related genes (Supplementary Fig. 2). Least absolute shrinkage and selection operator (LASSO) regression with 10-fold cross-validation was then performed to identify significant prognosis-related gene mutations using R software packages (http://www.r-project.org). Finally, the ten most significant prognosis-related gene mutations, ARID1A, ASH1L, CSMD1, CTNNB1, EYS, FMN2, MDM4, RB1, TP53, and ZFX4, were identified (Fig. 5).
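The first filtering step, keeping genes mutated in at least 10% of samples, can be sketched as follows. This covers only the frequency filter, not the LASSO Cox regression; the function name, the data layout (a mapping from sample ID to the set of mutated genes), and the toy data are assumptions for illustration.

```python
def frequently_mutated_genes(mutations_by_sample, candidate_genes, min_fraction=0.10):
    """Keep candidate genes mutated in at least min_fraction of samples."""
    n_samples = len(mutations_by_sample)
    if n_samples == 0:
        return []
    selected = []
    for gene in candidate_genes:
        # Count the samples in which this gene carries a mutation.
        n_mutated = sum(gene in genes for genes in mutations_by_sample.values())
        if n_mutated / n_samples >= min_fraction:
            selected.append(gene)
    return selected


# Hypothetical cohort of ten samples (IDs and mutation calls are made up).
cohort = {f"S{i}": set() for i in range(10)}
for sample_id in ("S0", "S1", "S2"):
    cohort[sample_id].add("TP53")
cohort["S3"].add("CTNNB1")

kept = frequently_mutated_genes(cohort, ["TP53", "CTNNB1", "RB1"])
print(kept)
```

The genes that pass this filter would then be the candidate predictors passed to the prognosis model.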

**Fig. 5: Prognosis-related mutated genes selection using the least absolute shrinkage and selection operator (LASSO) Cox regression model.**

Training deep-learning network

Pathological diagnosis was the primary endpoint of interest for the classifier that distinguishes tumors from normal liver tissue and for the assessment of histopathological grade. Gene mutation status (mutated or wild type), based on next-generation sequencing results, was the label for the mutation prediction classifier. The model’s training strategy was based on an easy-to-use platform called EASY DL (https://ai.baidu.com/easydl/), which uses the PaddlePaddle deep learning framework V3.0 created by Baidu Brain AI technology, the Inception V3 network developed by Google, and packaging code from Coudray and co-workers²⁰. The training set was used for training, and the testing set was used to evaluate performance, fine-tune the parameters, and improve the models. A final model was selected according to the results on the testing set, with the F1-score used as the stopping rule. Notably, the subsets were grouped by HCC patient rather than by WSI. This method could maximize the size of the training set and avoid training and testing on tiles originating from the same human subjects, thereby preventing the classifier from relying on intra-subject correlations between samples, which would result in inflated estimates of accuracy. To reduce selection bias, the performance of our model was then validated on the internal and external validation sets.
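The patient-level grouping described above can be sketched as below. This is an illustrative example rather than the authors' pipeline: the function name, the split fractions, the subset names, and the toy tile identifiers are all assumptions, since the actual proportions are not given in this section.

```python
import random


def split_by_patient(tile_to_patient, fractions=(0.7, 0.15, 0.15), seed=0):
    """Assign tiles to subsets so that no patient spans two subsets."""
    patients = sorted(set(tile_to_patient.values()))
    random.Random(seed).shuffle(patients)  # seeded for reproducibility
    n_train = round(len(patients) * fractions[0])
    n_test = round(len(patients) * fractions[1])
    groups = {
        "train": set(patients[:n_train]),
        "test": set(patients[n_train:n_train + n_test]),
        "validation": set(patients[n_train + n_test:]),
    }
    split = {name: [] for name in groups}
    for tile, patient in tile_to_patient.items():
        for name, members in groups.items():
            if patient in members:
                split[name].append(tile)  # the tile follows its patient
                break
    return split


# Hypothetical data: 20 patients with 3 tiles each.
tiles = {f"P{p:02d}_tile{t}": f"P{p:02d}" for p in range(20) for t in range(3)}
split = split_by_patient(tiles)
print({name: len(members) for name, members in split.items()})
```

Shuffling patients instead of tiles is the key design choice: every tile inherits the subset of its patient, so correlated tiles from one subject cannot leak between training and evaluation.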

Statistical analysis

The ten most common and prognostic mutated genes were identified using the LASSO Cox regression model, and differences in overall survival were evaluated by the Kaplan–Meier method with a log-rank test. The performance of the models was evaluated with the F1-score, MCC, and AUC. The F1-score, ranging from 1 (perfect) to 0 (bad), is the harmonic mean of precision and recall²¹. MCC ranges from 1 (perfect) to −1 (bad). In addition, the predicted probabilities of gene mutation were compared using two-tailed Mann–Whitney U-tests. A P value of less than 0.05 was considered statistically significant.
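As a reference for the two metrics defined above, the sketch below computes the F1-score and MCC from binary confusion-matrix counts. The function names and the example counts are assumptions for illustration; the multi-class grading task would need a generalized form that is not shown here.

```python
import math


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall (0 = bad, 1 = perfect)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient (-1 = bad, 1 = perfect)."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # common convention when any marginal total is zero
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts for a tumor-versus-normal classifier.
f1 = f1_score(tp=45, fp=10, fn=5)
corr = mcc(tp=45, tn=40, fp=10, fn=5)
print(round(f1, 3), round(corr, 3))
```

Unlike the F1-score, MCC uses all four cells of the confusion matrix, so it stays informative when the two classes are unbalanced.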

Data availability

The slide images and the corresponding cancer information were downloaded from the Genomic Data Commons portal (https://portal.gdc.cancer.gov/) and are in whole or in part based upon data generated by the TCGA Research Network (http://cancergenome.nih.gov/). These data were publicly available without restriction, authentication, or authorization. The datasets for the independent cohorts generated and/or analyzed during the current study are available from the corresponding author (X.J.C.) upon reasonable request and through collaborative investigations.

Code availability

The code used to train and validate the deep-learning model in this manuscript is available at https://github.com/drmaxchen-gbc/HCC-deep-learning. The work also used other open-source code (inception V3), which is available at https://github.com/openslide/openslide-python.

References

Siegel, R. L., Miller, K. D. & Jemal, A. Cancer statistics, 2019. CA Cancer J. Clin. 69, 7–34 (2019).
Miller, K. D. et al. Cancer statistics for Hispanics/Latinos, 2018. CA Cancer J. Clin. 68, 425–445 (2018).
Bray, F. et al. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 68, 394–424 (2018).
Kudo, M. et al. Brivanib as adjuvant therapy to transarterial chemoembolization in patients with hepatocellular carcinoma: a randomized phase III trial. Hepatology 60, 1697–1707 (2014).
Sayiner, M., Golabi, P. & Younossi, Z. M. Disease burden of hepatocellular carcinoma: a global perspective. Dig. Dis. Sci. https://doi.org/10.1007/s10620-019-05537-2 (2019).
Chaturvedi, V. K. et al. Molecular mechanistic insight of hepatitis B virus mediated hepatocellular carcinoma. Microb. Pathog. 128, 184–194 (2019).
Torres, H. A. et al. The oncologic burden of hepatitis C virus infection: a clinical perspective. CA Cancer J. Clin. 67, 411–431 (2017).
Vandenbulcke, H. et al. Alcohol intake increases the risk of HCC in hepatitis C virus-related compensated cirrhosis: a prospective study. J. Hepatol. 65, 543–551 (2016).
Rao, C. V., Asch, A. S. & Yamada, H. Y. Frequently mutated genes/pathways and genomic instability as prevention targets in liver cancer. Carcinogenesis 38, 2–11 (2017).
Juengpanich, S. et al. Role of cellular, molecular, and tumor microenvironment in hepatocellular carcinoma: possible targets and future directions in the Regorafenib Era. Int. J. Cancer. https://doi.org/10.1002/ijc.32970 (2020).
Zucman-Rossi, J., Villanueva, A., Nault, J. C. & Llovet, J. M. Genetic landscape and biomarkers of hepatocellular carcinoma. Gastroenterology 149, 1226–1239 (2015).
Nault, J. C. & Villanueva, A. Intratumor molecular and phenotypic diversity in hepatocellular carcinoma. Clin. Cancer Res. 21, 1786–1788 (2015).
Calderaro, J. et al. Histological subtypes of hepatocellular carcinoma are related to gene mutations and molecular tumour classification. J. Hepatol. 67, 727–738 (2017).
Zhou, Q. et al. Grading of hepatocellular carcinoma using 3D SE-DenseNet in dynamic enhanced MR images. Comput. Biol. Med. 107, 47–57 (2019).
Weston, A. D. et al. Automated abdominal segmentation of CT scans for body composition analysis using deep learning. Radiology 290, 669–679 (2019).
Yi, F., Huang, J., Yang, L., Xie, Y. & Xiao, G. Automatic extraction of cell nuclei from H&E-stained histopathological images. J. Med. Imaging 4, 027502 (2017).
Xing, F., Xie, Y. & Yang, L. An automatic learning-based framework for robust nucleus segmentation. IEEE Trans. Med. Imaging 35, 550–566 (2016).
Lin, H. et al. Automated classification of hepatocellular carcinoma differentiation using multiphoton microscopy and deep learning. J. Biophoton. https://doi.org/10.1002/jbio.201800435 (2019).
Li, S., Jiang, H. & Pang, W. Joint multiple fully connected convolutional neural network with extreme learning machine for hepatocellular carcinoma nuclei grading. Comput. Biol. Med. 84, 156–167 (2017).
Coudray, N. et al. Classification and mutation prediction from non-small cell lung cancer histopathology images using deep learning. Nat. Med. 24, 1559–1567 (2018).
Darcy, A. M., Louie, A. K. & Roberts, L. W. Machine learning and the profession of medicine. JAMA 315, 551–552 (2016).
Skrede, O. J. et al. Deep learning for prediction of colorectal cancer outcome: a discovery and validation study. Lancet 395, 350–360 (2020).
Kather, J. N. et al. Deep learning can predict microsatellite instability directly from histology in gastrointestinal cancer. Nat. Med. 25, 1054–1056 (2019).
Ehteshami Bejnordi, B. et al. Using deep convolutional neural networks to identify and classify tumor-associated stroma in diagnostic breast biopsies. Mod. Pathol. 31, 1502–1512 (2018).
Bera, K., Schalper, K. A., Rimm, D. L., Velcheti, V. & Madabhushi, A. Artificial intelligence in digital pathology—new tools for diagnosis and precision oncology. Nat. Rev. Clin. Oncol. 16, 703–715 (2019).
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J. & Wojna, Z. Rethinking the Inception Architecture for Computer Vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2818–2826 (2015).
Agarwal, R., Narayan, J., Bhattacharyya, A., Saraswat, M. & Tomar, A. K. Gene expression profiling, pathway analysis and subtype classification reveal molecular heterogeneity in hepatocellular carcinoma and suggest subtype specific therapeutic targets. Cancer Genet 216–217, 37–51 (2017).
Zaman, G. J. R. et al. TTK inhibitors as a targeted therapy for CTNNB1 (beta-catenin) mutant cancers. Mol. Cancer Ther. 16, 2609–2617 (2017).
Liu, X., Liao, W., Yuan, Q., Ou, Y. & Huang, J. TTK activates Akt and promotes proliferation and migration of hepatocellular carcinoma cells. Oncotarget 6, 34309–34320 (2015).
Liang, X. D. et al. Expression and function analysis of mitotic checkpoint genes identifies TTK as a potential therapeutic target for human hepatocellular carcinoma. PLoS ONE 9, e97739 (2014).
Dietz, R. L. & Pantanowitz, L. The future of anatomic pathology: deus ex machina? J. Med. Artif. Intell. 2, 4 (2019).
Tizhoosh, H. R. & Pantanowitz, L. Artificial intelligence and digital pathology: challenges and opportunities. J. Pathol. Inf. 9, 38 (2018).
Maddox, T. M., Rumsfeld, J. S. & Payne, P. R. O. Questions for artificial intelligence in health care. JAMA 321, 31–32 (2019).
Stead, W. W. Clinical implications and challenges of artificial intelligence and deep learning. JAMA 320, 1107–1108 (2018).

Acknowledgements

We would like to thank the EASY DL team and Hangzhou **xuan Health technology Co., Ltd. for their assistance in training our models. Thanks to Y.C., J.H.H., S.J.L., F.Y. and all our colleagues for their assistance in this study. An abstract of this study was presented at The International Liver Congress™ 2019 (EASL 2019) as a Late-Breaker poster in Vienna, Austria, on April 11–13, 2019. This work was supported by the Opening Fund of Engineering Research Center of Cognitive Healthcare of Zhejiang Province (No.2018KFJJ09), Zhejiang Medical Health Science and Technology Project (No.2016133597), and National Natural Science Foundation of China (No.81827804).

Author information

Authors and Affiliations

Department of General Surgery, Sir Run-Run Shaw Hospital, Zhejiang University, 310016, Hangzhou, China
Mingyu Chen, Bin Zhang, Jiasheng Cao, Hepan Zhu, Qijiang Mao, Hong Yu & Xiujun Cai
Key Laboratory of Endoscopic Technique Research of Zhejiang Province, Sir Run-Run Shaw Hospital, Zhejiang University, 310016, Hangzhou, China
Mingyu Chen & Xiujun Cai
Engineering Research Center of Cognitive Healthcare of Zhejiang Province, 310003, Hangzhou, China
Mingyu Chen & Xiujun Cai
Zhejiang University School of Medicine, 310000, Hangzhou, China
Win Topatana & Sarun Juengpanich

Contributions

M.Y.C., J.S.C., W.T., H.Y., and B.Z. were involved in the study design, data collection and analysis, and drafted the paper; H.P.Z., S.J., and Q.J.M. collected and checked data; M.Y.C., J.S.C., X.J.C., and W.T. revised the paper; X.J.C. designed and supervised the study; and all authors wrote the paper.

Corresponding authors

Correspondence to Hong Yu or Xiujun Cai.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Figures

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Chen, M., Zhang, B., Topatana, W. et al. Classification and mutation prediction based on histopathology H&E images in liver cancer using deep learning. npj Precis. Onc. 4, 14 (2020). https://doi.org/10.1038/s41698-020-0120-3

Received: 17 February 2020
Accepted: 07 May 2020
Published: 08 June 2020
DOI: https://doi.org/10.1038/s41698-020-0120-3
Springer Nature Limited
