Automated Age-Related Macular Degeneration Diagnosis in Retinal Fundus Images via ViT

Vannadil, Niranjana; Kokil, Priyanka

doi:10.1007/978-3-031-47942-7_24

Part of the book series: Signals and Communication Technology (SCT)

Included in the following conference series:

International Conference on Machine Learning, Deep Learning and Computational Intelligence for Wireless Communication

Abstract

Age-related macular degeneration (AMD) is a visual impairment that commonly affects elderly individuals and is a leading cause of vision loss in people over the age of 60. Early detection makes it possible to plan treatment that slows the progress of the disease and prevents severe vision loss, but it requires comprehensive eye examinations and accurate diagnosis, and manual diagnosis is subject to human error. This chapter proposes an automated detection system to address these issues. Several deep learning techniques are explored to identify the most suitable algorithm for automated detection. Deep learning algorithms, particularly convolutional neural networks (CNNs), are the front-runners for medical image classification and detection problems. While CNN models are accurate, they are also computationally intensive. This chapter therefore explores whether a lightweight deep learning architecture, the Vision Transformer (ViT), and its variants can diagnose AMD. State-of-the-art CNN models (AlexNet, MobileNet, and Xception) are compared with transformer models (ViT, Modified ViT, and Swin Transformer) on accuracy, average training time (ATT), CPU utilization, and GPU utilization, in order to identify models that offer good accuracy and are suitable for mobile applications.
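As a rough illustration of two of the comparison criteria named in the abstract, the sketch below computes classification accuracy and an average training time. It is not the authors' evaluation code. It assumes ATT means mean wall-clock seconds per epoch, which the abstract does not state, and the function names (`accuracy`, `average_training_time`, `rank_models`) and the result keys (`"accuracy"`, `"att_seconds"`) are hypothetical. CPU and GPU utilization are left out because measuring them depends on tooling the abstract does not describe.

```python
import time
from statistics import mean
from typing import Callable, Dict, List, Sequence


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that match the ground-truth labels."""
    if len(predictions) != len(labels) or len(labels) == 0:
        raise ValueError("predictions and labels must be non-empty and equal in length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def average_training_time(train_one_epoch: Callable[[], None], epochs: int) -> float:
    """Mean wall-clock seconds per epoch (an assumed reading of ATT)."""
    if epochs < 1:
        raise ValueError("epochs must be at least 1")
    durations = []
    for _ in range(epochs):
        start = time.perf_counter()
        train_one_epoch()  # hypothetical callback that runs one training epoch
        durations.append(time.perf_counter() - start)
    return mean(durations)


def rank_models(results: Dict[str, Dict[str, float]]) -> List[str]:
    """Order model names by accuracy (highest first), then by lower ATT."""
    return sorted(
        results,
        key=lambda name: (-results[name]["accuracy"], results[name]["att_seconds"]),
    )
```

Ranking on accuracy first and training time second is only one possible way to weigh the criteria; a mobile deployment might instead filter on a resource budget before comparing accuracy.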

Acknowledgements

The research reported in this chapter was supported by the Department of Science and Technology (DST), Government of India, under the Fund for Improvement of S&T Infrastructure (FIST), Grant no. SR/FST/ET-I/2020/578, and by the Science and Engineering Research Board (SERB), Grant no. EEQ/2021/000804.

Author information

Authors and Affiliations

Indian Institute of Information Technology, Design and Manufacturing, Advanced Signal and Image Processing (ASIP) Lab, Center for Healthcare Advancement, Research and Innovation (C-HARI), Department of Electronics and Communication Engineering, Kancheepuram, Chennai, India
Niranjana Vannadil & Priyanka Kokil

Corresponding author

Correspondence to Niranjana Vannadil.

Editor information

Editors and Affiliations

Department of Electronics and Communication Engineering, National Institute of Technology Tiruchirappalli, Tiruchirappalli, Tamil Nadu, India
E. S. Gopi
Department of Electronics and Communication Engineering, National Institute of Technology Tiruchirappalli, Tiruchirappalli, Tamil Nadu, India
P. Maheswaran

About this paper

Cite this paper

Vannadil, N., Kokil, P. (2024). Automated Age-Related Macular Degeneration Diagnosis in Retinal Fundus Images via ViT. In: Gopi, E.S., Maheswaran, P. (eds) Proceedings of the International Conference on Machine Learning, Deep Learning and Computational Intelligence for Wireless Communication. MDCWC 2023. Signals and Communication Technology. Springer, Cham. https://doi.org/10.1007/978-3-031-47942-7_24

DOI: https://doi.org/10.1007/978-3-031-47942-7_24
Published: 24 February 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-47941-0
Online ISBN: 978-3-031-47942-7
eBook Packages: Intelligent Technologies and Robotics (R0)
