Abstract

Age-related Macular Degeneration (AMD) is a visual impairment condition that commonly affects elderly individuals and is a leading cause of vision loss in people over the age of 60. Early detection can help identify a treatment plan that slows the progress of the disease and prevents severe loss of vision, but it requires comprehensive eye examinations and an accurate diagnosis, and manual diagnosis is subject to human error. This chapter proposes an automated detection system that circumvents these issues and explores multiple deep learning techniques in order to identify an optimum algorithm for automated detection. Deep learning algorithms, particularly Convolutional Neural Networks (CNNs), are the front-runners for medical image classification and detection problems. While CNN models are accurate, they are also computationally intensive. This chapter therefore explores the possibility of using a lightweight deep learning architecture, namely the Vision Transformer (ViT) and its variants, to diagnose AMD. State-of-the-art CNN models (AlexNet, MobileNet, and Xception) are compared with transformer models (ViT, Modified ViT, and Swin Transformer). The comparison is based on accuracy, average training time (ATT), CPU utilization, and GPU utilization, in order to identify models that can be used in mobile applications with good accuracy.
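As a rough illustration of two of the comparison criteria named in the abstract, accuracy and average training time (ATT), the sketch below shows one way such measurements might be collected. It is not the authors' code. The function names, the interpretation of ATT as mean wall-clock time per epoch, and the placeholder training step are all assumptions made for illustration. CPU and GPU utilization are left out because measuring them requires platform-specific tooling.

```python
import time
from statistics import mean
from typing import Callable, Sequence


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that match the ground-truth labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("at least one sample is required")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


def average_training_time(train_one_epoch: Callable[[], None], epochs: int) -> float:
    """Mean wall-clock seconds per epoch.

    Assumption: ATT is taken here as the per-epoch average; the abstract
    does not state how the chapter defines it.
    """
    if epochs < 1:
        raise ValueError("epochs must be at least 1")
    durations = []
    for _ in range(epochs):
        start = time.perf_counter()
        train_one_epoch()  # hypothetical callable that runs one training epoch
        durations.append(time.perf_counter() - start)
    return mean(durations)


if __name__ == "__main__":
    # Toy binary labels (1 = AMD, 0 = no AMD); illustrative values only.
    preds = [1, 0, 1, 1]
    truth = [1, 0, 0, 1]
    print(f"accuracy: {accuracy(preds, truth):.2f}")

    # Placeholder epoch that does no real work; swap in a real training step.
    att = average_training_time(lambda: None, epochs=3)
    print(f"average training time per epoch: {att:.6f} s")
```

Running the same two functions for every candidate model, on the same data and hardware, would give directly comparable accuracy and timing figures.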



Acknowledgements

The research reported in this chapter was supported by the Department of Science and Technology (DST) under the Fund for Improvement of S&T Infrastructure (FIST), Government of India, under grant no. SR/FST/ET-I/2020/578, and by the Science and Engineering Research Board (SERB) under grant no. EEQ/2021/000804.

Author information


Corresponding author

Correspondence to Niranjana Vannadil.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Vannadil, N., Kokil, P. (2024). Automated Age-Related Macular Degeneration Diagnosis in Retinal Fundus Images via ViT. In: Gopi, E.S., Maheswaran, P. (eds) Proceedings of the International Conference on Machine Learning, Deep Learning and Computational Intelligence for Wireless Communication. MDCWC 2023. Signals and Communication Technology. Springer, Cham. https://doi.org/10.1007/978-3-031-47942-7_24
