Abstract
Age-related Macular Degeneration (AMD) is a visual impairment that commonly affects elderly individuals and is a leading cause of vision loss in people over the age of 60. Early detection makes it possible to plan treatment that slows the progress of the disease and prevents severe loss of vision, but it requires comprehensive eye examinations and an accurate diagnosis, and manual diagnosis is subject to human error. This chapter proposes an automated detection system that circumvents these issues. Several deep learning techniques are explored to identify the most suitable algorithm for automated detection. Deep learning algorithms, particularly Convolutional Neural Networks (CNNs), are the front-runners for medical image classification and detection problems. While CNN models are accurate, they are also computationally intensive. This chapter therefore explores whether a lightweight deep learning architecture, the Vision Transformer (ViT), and its variants can be used to diagnose AMD. State-of-the-art CNN models (AlexNet, MobileNet, and Xception) are compared with transformer models (ViT, Modified ViT, and Swin Transformer). The comparison is based on accuracy, average training time (ATT), CPU utilization, and GPU utilization, with the aim of identifying models that can be used in mobile applications while retaining good accuracy.
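As an illustration of two of the comparison criteria named above, the following minimal Python sketch computes classification accuracy and an average training time. It is not the authors' evaluation code. It assumes that ATT means the mean wall-clock time per training epoch, which the abstract does not state, and the names `accuracy`, `average_training_time`, and `dummy_epoch` are hypothetical. CPU and GPU utilization are not covered because measuring them requires platform-specific tooling.

```python
import time
from statistics import mean
from typing import Callable, Sequence


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that match the ground-truth labels."""
    if len(predictions) != len(labels) or len(labels) == 0:
        raise ValueError("predictions and labels must be non-empty and equal in length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def average_training_time(train_one_epoch: Callable[[], None], epochs: int) -> float:
    """Mean wall-clock seconds per epoch.

    Assumption: ATT is interpreted here as time per epoch; the abstract
    does not define it.
    """
    if epochs < 1:
        raise ValueError("epochs must be at least 1")
    durations = []
    for _ in range(epochs):
        start = time.perf_counter()
        train_one_epoch()
        durations.append(time.perf_counter() - start)
    return mean(durations)


def dummy_epoch() -> None:
    # Hypothetical stand-in for one real training epoch of a CNN or ViT model.
    sum(i * i for i in range(10_000))


if __name__ == "__main__":
    att = average_training_time(dummy_epoch, epochs=3)
    acc = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
    print(f"ATT: {att:.4f} s/epoch, accuracy: {acc:.2f}")
```

In a real comparison, `dummy_epoch` would be replaced by one training epoch of each model, run on the same hardware and data so that the timings are comparable.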
Acknowledgements
The research reported in this chapter was supported by the Department of Science and Technology (DST), Government of India, under the Fund for Improvement of S&T Infrastructure (FIST), grant no. SR/FST/ET-I/2020/578, and by the Science and Engineering Research Board (SERB), grant no. EEQ/2021/000804.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Vannadil, N., Kokil, P. (2024). Automated Age-Related Macular Degeneration Diagnosis in Retinal Fundus Images via ViT. In: Gopi, E.S., Maheswaran, P. (eds) Proceedings of the International Conference on Machine Learning, Deep Learning and Computational Intelligence for Wireless Communication. MDCWC 2023. Signals and Communication Technology. Springer, Cham. https://doi.org/10.1007/978-3-031-47942-7_24
DOI: https://doi.org/10.1007/978-3-031-47942-7_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-47941-0
Online ISBN: 978-3-031-47942-7
eBook Packages: Intelligent Technologies and Robotics (R0)