A Data-Efficient Deep Learning Framework for Segmentation and Classification of Histopathology Images

  • Conference paper
  • Computer Vision – ECCV 2022 Workshops (ECCV 2022)
  • Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13803)

Abstract

The study of the cellular architecture of inflammation in histopathology images, commonly performed for diagnosis and research, excludes much of the information available on the biopsy slide. In autoimmune diseases, major outstanding research questions remain about which cell types participate in inflammation at the tissue level and how they interact with each other. While these questions can be partially answered using traditional methods, artificial-intelligence approaches to segmentation and classification provide a much more efficient way to understand the architecture of inflammation in autoimmune disease, holding great promise for novel insights. In this paper, we empirically develop deep learning approaches that use dermatomyositis biopsies of human tissue to detect and identify inflammatory cells. Our approach improves classification performance by 26% and segmentation performance by 5%. We also propose a novel post-processing autoencoder architecture that improves segmentation performance by an additional 3%.


References

  1. Agarwal, V., Jhalani, H., Singh, P., Dixit, R.: Classification of melanoma using EfficientNets with multiple ensembles and metadata. In: Tiwari, R., Mishra, A., Yadav, N., Pavone, M. (eds.) Proceedings of International Conference on Computational Intelligence. AIS, pp. 101–111. Springer, Singapore (2022). https://doi.org/10.1007/978-981-16-3802-2_8


  2. Brock, A., De, S., Smith, S.L., Simonyan, K.: High-performance large-scale image recognition without normalization. In: International Conference on Machine Learning, pp. 1059–1071. PMLR (2021)


  3. Dash, M., Londhe, N.D., Ghosh, S., Semwal, A., Sonawane, R.S.: PsLSNet: automated psoriasis skin lesion segmentation using a modified U-Net-based fully convolutional network. Biomed. Signal Process. Control 52, 226–237 (2019). https://doi.org/10.1016/j.bspc.2019.04.002, https://www.sciencedirect.com/science/article/pii/S1746809419300990

  4. Dinse, G.E., et al.: Increasing prevalence of antinuclear antibodies in the United States. Arthritis Rheumatol. 72(6), 1026–1035 (2020)


  5. Dosovitskiy, A., et al.: An image is worth 16x16 words: transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2020)

  6. Ehrenfeld, M., et al.: Covid-19 and autoimmunity. Autoimmun. Rev. 19(8), 102597 (2020)


  7. Falcon, W., et al.: PyTorch Lightning. GitHub (2019). https://github.com/PyTorchLightning/pytorch-lightning

  8. Galeotti, C., Bayry, J.: Autoimmune and inflammatory diseases following covid-19. Nat. Rev. Rheumatol. 16(8), 413–414 (2020)


  9. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. arXiv preprint arXiv:1512.03385 (2015)


  10. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 7132–7141 (2018)


  11. Izmailov, P., Podoprikhin, D., Garipov, T., Vetrov, D., Wilson, A.G.: Averaging weights leads to wider optima and better generalization. arXiv preprint arXiv:1803.05407 (2018)

  12. Jacobson, D.L., Gange, S.J., Rose, N.R., Graham, N.M.: Epidemiology and estimated population burden of selected autoimmune diseases in the United States. Clin. Immunol. Immunopathol. 84(3), 223–243 (1997)


  13. Lerner, A., Jeremias, P., Matthias, T.: The world incidence and prevalence of autoimmune diseases is increasing. Int. J. Celiac Disease 3(4), 151–155 (2015). https://doi.org/10.12691/ijcd-3-4-8, http://pubs.sciepub.com/ijcd/3/4/8

  14. Lin, T.Y., Goyal, P., Girshick, R., He, K., Dollár, P.: Focal loss for dense object detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2980–2988 (2017)


  15. Liu, Z., et al.: Swin transformer: Hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10012–10022 (2021)


  16. Liu, Z., Mao, H., Wu, C.Y., Feichtenhofer, C., Darrell, T., Xie, S.: A ConvNet for the 2020s. arXiv preprint arXiv:2201.03545 (2022)

  17. Picard, D.: Torch.manual_seed(3407) is all you need: On the influence of random seeds in deep learning architectures for computer vision. CoRR abs/2109.08203 (2021). https://arxiv.org/abs/2109.08203

  18. Raghu, M., Zhang, C., Kleinberg, J., Bengio, S.: Transfusion: Understanding transfer learning for medical imaging. In: Advances in Neural Information Processing Systems, vol. 32 (2019)


  19. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28


  20. Stafford, I., Kellermann, M., Mossotto, E., Beattie, R., MacArthur, B., Ennis, S.: A systematic review of the applications of artificial intelligence and machine learning in autoimmune diseases. NPJ Digital Med. 3(1), 1–11 (2020)


  21. Tan, M., Le, Q.: Efficientnet: Rethinking model scaling for convolutional neural networks. In: International Conference on Machine Learning, pp. 6105–6114. PMLR (2019)


  22. Tsakalidou, V.N., Mitsou, P., Papakostas, G.A.: Computer vision in autoimmune diseases diagnosis—current status and perspectives. In: Smys, S., Tavares, J.M.R.S., Balas, V.E. (eds.) Computational Vision and Bio-Inspired Computing. AISC, vol. 1420, pp. 571–586. Springer, Singapore (2022). https://doi.org/10.1007/978-981-16-9573-5_41


  23. Buren, V., et al.: Artificial intelligence and deep learning to map immune cell types in inflamed human tissue. J. Immunol. Methods 505, 113233 (2022). https://doi.org/10.1016/j.jim.2022.113233, https://www.sciencedirect.com/science/article/pii/S0022175922000205

  24. Wightman, R.: PyTorch Image Models. GitHub (2019). https://github.com/rwightman/pytorch-image-models, https://doi.org/10.5281/zenodo.4414861

  25. Xie, C., Tan, M., Gong, B., Wang, J., Yuille, A.L., Le, Q.V.: Adversarial examples improve image recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 819–828 (2020)


  26. Xie, Q., Luong, M.T., Hovy, E., Le, Q.V.: Self-training with noisy student improves ImageNet classification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10687–10698 (2020)


  27. Zhou, Z., Rahman Siddiquee, M.M., Tajbakhsh, N., Liang, J.: UNet++: a nested u-net architecture for medical image segmentation. In: Stoyanov, D., et al. (eds.) DLMIA/ML-CDS -2018. LNCS, vol. 11045, pp. 3–11. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-00889-5_1



Acknowledgment

We would like to thank NYU HPC team for assisting us with our computational needs. We would also like to thank Prof. Elena Sizikova (Moore Sloan Faculty Fellow, Center for Data Science (CDS), New York University (NYU)) for her valuable feedback.

Author information

Correspondence to Pranav Singh.

A Appendix


1.1 A.1 Expansion of Results

In Tables 6 and 7 we show complete results with mean and standard deviation. They expand Table 2 in Sect. 5.1 of the main paper, which was compressed to save space and focused only on the main results. To provide a complete picture, we add the extended results in this section.

Table 6. This table shows the IoU score on the test set for UNet. We compare results without and with the autoencoder for both ReLU and GELU activations for the UNet architecture. These results are averaged over five runs with different seed values. We observed that in all cases the addition of APP improved performance. The GELU-activated APP seems to outperform the ReLU-activated APP in all cases except ResNet-50.
Table 7. This table shows the IoU score on the test set for UNet++. These results are averaged over five runs with different seed values. We compare results without and with the autoencoder for both ReLU and GELU activations for the UNet++ architecture. We observed that in most cases APP improves performance, except for UNet++ with ResNet-18, where APP segmentation lags by around 5%. Conversely, for ResNet-34, APP-based segmentation is almost 10% better than UNet++ without APP.

1.2 A.2 Autoencoder with Efficientnet Encoder for Segmentation

In Tables 8 and 9 we compare the training time and performance of the respective architectures for segmentation using EfficientNet encoders. We observed that with the addition of autoencoder post-processing, training time increased by an average of 3 m 7.3 s over 50 epochs (averaged over the entire EfficientNet family). This is an increase of 2.93% in training time over the eight encoders, or an average increase of 0.36% in training time per encoder over 50 epochs.

Performance-wise, architectures with autoencoder post-processing consistently outperformed segmentation architectures without them by 2.75%.

Table 8. In this table we report the running time averaged over 5 runs with different seeds for the EfficientNet encoder family with UNet. The variation is almost negligible (\(<6\) s).
Table 9. In this table we report the IoU averaged over 5 runs with different seeds for the EfficientNet encoder family with the UNet architecture.
Table 10. In this table we report the running time averaged over 5 runs with different seeds for the EfficientNet encoder family with UNet++.

Similarly, we compared computational cost and performance for UNet++ with and without autoencoder post-processing in Tables 10 and 11, respectively. In this case, we observed that the gain in performance with autoencoder post-processing is 5%, averaged over the EfficientNet family of encoders. This corresponds to a 3 m 7 s increase in training time, an increase of 2.6%.

Table 11. In this table we report the IoU averaged over 5 runs with different seeds for the EfficientNet encoder family with the UNet++ architecture.

1.3 A.3 Metrics Description

For measuring segmentation performance, we use the intersection-over-union (IoU) metric, which measures how similar two sample sets are.

\(\text {IoU}(A, B) = \frac{|A \cap B|}{|A \cup B|} = \frac{\text {TP}}{\text {TP+FP+FN}}\)

Here the output mask of the segmentation pipeline is compared against the ground-truth mask.

For measuring classification performance, we use the F1 score.

Computed as \(F1 = \frac{\text {2*Precision*Recall}}{\text {Precision+Recall}} = \frac{\text {2*TP}}{\text {2*TP+FP+FN}}\)
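The two metrics above can be sketched directly from binary masks and labels; a minimal NumPy implementation (function names are ours, not from the paper):

```python
import numpy as np

def iou_score(pred: np.ndarray, target: np.ndarray) -> float:
    """Intersection over union between two binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return float(intersection / union) if union > 0 else 1.0

def f1_score(pred: np.ndarray, target: np.ndarray) -> float:
    """F1 = 2*TP / (2*TP + FP + FN) for binary predictions."""
    pred, target = pred.astype(bool), target.astype(bool)
    tp = np.logical_and(pred, target).sum()
    fp = np.logical_and(pred, ~target).sum()
    fn = np.logical_and(~pred, target).sum()
    return float(2 * tp / (2 * tp + fp + fn))

# Tiny 2x2 example: one true positive, one false positive, one false negative.
pred = np.array([[1, 1], [0, 0]])
gt = np.array([[1, 0], [1, 0]])
print(round(iou_score(pred, gt), 3))  # 1 / 3 -> 0.333
print(round(f1_score(pred, gt), 3))   # 2 / (2 + 1 + 1) -> 0.5
```

Note that for binary masks the two metrics are monotonically related (IoU = F1 / (2 − F1)), which is why the paper reports IoU for segmentation and F1 for classification.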

1.4 A.4 Effect of Different Weights

ImageNet initialization has been the de facto norm for most transfer-learning tasks, although in some cases, as in [1], noisy-student weights were observed to perform better than ImageNet initialization. To study the effect in our case, we used advprop and noisy-student initialization. ImageNet weights work for medical data not because of feature reuse but because of better weight scaling and faster convergence [18]. Noisy student training [26] extends the ideas of self-training and distillation by using equal-or-larger student models and by adding noise such as dropout, stochastic depth, and data augmentation via RandAugment to the student during learning, so that the student generalizes better than the teacher. First, an EfficientNet model is trained on labelled images and used as a teacher to generate pseudo-labels for 300M unlabelled images. A larger EfficientNet is then trained as a student on the combination of labelled and pseudo-labelled images. This reduces the error rate, increases robustness, and improves performance over the previous state of the art on ImageNet.

AdvProp training builds on adversarial examples, which are commonly viewed as a threat to ConvNets. In [25], the authors present the opposite perspective: adversarial examples can be used to improve image recognition models by treating them as additional training examples that prevent overfitting. The scheme performs better as models grow bigger, and it improves performance on ImageNet and several of its subset benchmarks.

Since these initializations were originally developed for the EfficientNet family of encoders, we used EfficientNet encoders for benchmarking. We present the results in Table 12.

Table 12. Using different initializations, we compare the performance of EfficientNet-family encoders on UNet, reporting the IoU over the test set. We observe that while smaller models gain performance from advprop and noisy-student weights, ImageNet initialisation works better for larger models. Moreover, advprop and noisy-student weights are not readily available for all models, so ImageNet remains the dominant choice.
Table 13. We report the F1 score for different initializations for the EfficientNet family of encoders, averaged over 6-fold runs on the test set with five different seed values. We observed a peak of 0.8463 with ImageNet, 0.843 with advprop, and 0.8457 with noisy-student initialization.

We conducted similar experiments for classification with different initializations and report the results in Table 13.

As we can see, for segmentation ImageNet initialization performed better in most cases. For classification, it not only performed better in most cases but also provided the best overall result. These observations, combined with the fact that advprop and noisy student require additional computational resources, led us to stick with ImageNet initialization.

1.5 A.5 Expansion on Experimental Details

Segmentation. We used PyTorch Lightning's [7] seed_everything functionality to seed all generator values uniformly. To set the seed values, we randomly generated a set of numbers in the range 1 to 1000; we did not perform an extensive search of the seed space to optimise performance, as suggested in [17]. We used seed values 26, 77, 334, 517 and 994. For augmentation, we converted to a PIL image, applied random rotation (degrees = 3) and random vertical and horizontal flips, converted to a tensor, and finally applied channel normalisation. We could have used a resize function to reshape the 1408 by 1876 Whole Slide Images (WSI), but we instead tiled them into 480-pixel square tiles. We then split them into batches of size 16 before finally passing them through the segmentation architecture (UNet/UNet++). We used a channel-attention-only decoder, with ImageNet initialisation and a decoder depth of 3 (256, 128, 64).
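The tiling step can be sketched as follows; since 1408 and 1876 are not multiples of 480, we pad the image up to the next multiple before cutting non-overlapping tiles (the padding scheme is our assumption, not stated in the paper):

```python
import numpy as np

TILE = 480  # tile side length used in the paper

def tile_image(img: np.ndarray, tile: int = TILE) -> list:
    """Pad an H x W x C image up to a multiple of `tile`, then cut
    non-overlapping square tiles in row-major order."""
    h, w = img.shape[:2]
    pad_h = (-h) % tile  # rows needed to reach the next multiple of `tile`
    pad_w = (-w) % tile
    img = np.pad(img, ((0, pad_h), (0, pad_w), (0, 0)), mode="constant")
    tiles = []
    for y in range(0, img.shape[0], tile):
        for x in range(0, img.shape[1], tile):
            tiles.append(img[y:y + tile, x:x + tile])
    return tiles

wsi = np.zeros((1408, 1876, 3), dtype=np.uint8)  # WSI size from the paper
tiles = tile_image(wsi)
print(len(tiles))  # 1408 -> 1440 (3 rows), 1876 -> 1920 (4 cols) => 12 tiles
```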

We used cross-entropy loss with dark/light pixel normalization and the Adam optimizer with the learning rate set to 3.6e-04 and a weight decay of 1e-05. We used a cosine learning-rate schedule with the minimum value set to 3.4e-04.
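The cosine schedule above can be written in closed form; a minimal sketch with the reported learning rates (the 50-epoch horizon matches the runs in Appendix A.2, but the exact T_max is our assumption):

```python
import math

def cosine_lr(epoch: int, base_lr: float = 3.6e-4,
              min_lr: float = 3.4e-4, t_max: int = 50) -> float:
    """Cosine annealing: starts at base_lr and decays to min_lr at t_max."""
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * epoch / t_max))

print(cosine_lr(0))   # 3.6e-4 at the start of training
print(cosine_lr(25))  # 3.5e-4 halfway through
print(cosine_lr(50))  # 3.4e-4 at the end
```

In PyTorch this corresponds to `torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=50, eta_min=3.4e-4)`.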

APP Segmentation. When using APP, we used GELU activation by default with the Adam optimizer and the learning rate set to 1e-3.
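A minimal sketch of what a GELU-activated autoencoder post-processing (APP) block could look like; the layer widths and depths here are our assumptions (the paper specifies only GELU activation, Adam, and lr = 1e-3), not the authors' exact architecture:

```python
import torch
import torch.nn as nn

class APP(nn.Module):
    """Convolutional autoencoder that refines a predicted mask (sketch)."""
    def __init__(self, channels: int = 1):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(channels, 16, 3, stride=2, padding=1), nn.GELU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.GELU())
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 2, stride=2), nn.GELU(),
            nn.ConvTranspose2d(16, channels, 2, stride=2))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))

app = APP()
optimizer = torch.optim.Adam(app.parameters(), lr=1e-3)  # lr as reported
mask = torch.rand(1, 1, 480, 480)  # a predicted mask for one 480x480 tile
refined = app(mask)                # output keeps the input resolution
```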

Classification. For classification, we used the same seed values with PyTorch Lightning's [7] seed_everything functionality, as described for segmentation above. For augmentation, we resized the images to 384 × 384, then randomly applied colour jitter (0.2, 0.2, 0.2) or random perspective (distortion scale = 0.2) with probability 0.3, colour jitter (0.2, 0.2, 0.2) or random affine (degrees = 10) with probability 0.3, random vertical and horizontal flips with probability 0.3, and finally channel normalization.

We used stochastic weight averaging [11] with the Adam optimizer and a cosine learning-rate schedule starting at 1e-3 with a minimum of 1e-6. We used focal loss with normalized class weights as our loss function. We used 6-fold validation with 20 epochs per fold and a batch size of 16, and the same parameters for both CNNs and transformers.
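The class-weighted focal loss [14] can be sketched as follows; the focusing parameter gamma = 2 is the default from the focal-loss paper and our assumption here, as are the example values:

```python
import numpy as np

def focal_loss(probs: np.ndarray, targets: np.ndarray,
               class_weights: np.ndarray, gamma: float = 2.0) -> float:
    """Class-weighted focal loss: -alpha_t * (1 - p_t)^gamma * log(p_t),
    averaged over the batch. `probs` is (N, C) softmax output."""
    probs = np.clip(probs, 1e-7, 1 - 1e-7)
    p_t = probs[np.arange(len(targets)), targets]  # prob of the true class
    alpha_t = class_weights[targets]               # per-class weight
    return float(np.mean(-alpha_t * (1 - p_t) ** gamma * np.log(p_t)))

probs = np.array([[0.9, 0.1],   # well-classified: contributes little
                  [0.4, 0.6]])  # harder example: dominates the loss
targets = np.array([0, 1])
weights = np.array([0.5, 0.5])  # normalized class weights (sum to 1)
loss = focal_loss(probs, targets, weights)
print(round(loss, 4))
```

The (1 − p_t)^γ factor down-weights easy, well-classified pixels or images, which is why focal loss pairs well with the class imbalance typical of histopathology data.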


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Singh, P., Cirrone, J. (2023). A Data-Efficient Deep Learning Framework for Segmentation and Classification of Histopathology Images. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13803. Springer, Cham. https://doi.org/10.1007/978-3-031-25066-8_21


  • DOI: https://doi.org/10.1007/978-3-031-25066-8_21


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-25065-1

  • Online ISBN: 978-3-031-25066-8

