Abstract
Spontaneous retinal venous pulsations (SVPs) are vital dynamic biomarkers: rhythmic changes of the central retinal vein observed at the optic disc region (ODR) of the eye. In light of their crucial clinical role, automatic detection of SVPs from fundus videos has become an area of burgeoning research. However, inherent eye movements and variable retinal video quality pose significant challenges to direct SVP detection with existing deep learning models. In response, we devise a spatio-temporal context-based masking approach (STC Masking) that exploits the spatio-temporal characteristics of SVPs to enhance their detection in retinal videos. We first apply a spatio-temporal mask to clip the video into an ODR-focused video tube. Diverging from conventional masking with gray or black blocks, we then employ a context masking method that uses the original pixel values of the video frames as the mask fill-in. The context mask temporally transforms dynamic video tubes into static ones, thus changing the pulsation status of the SVPs. Correspondingly, we adjust the SVP video labels according to the extent of the masked regions to avoid ambiguity in data labelling. This strategy yields masked videos that are pixel-wise similar to the unmasked ones yet carry contrasting semantics in SVP-presenting regions. It enables the network to capture the most discriminative regions through spatio-temporal variations, allowing explicit detection of SVP presence in a video. Our experiments demonstrate the efficacy of the STC masking strategy, which outperforms baseline methods. This work thereby underscores the potential of grid context-based masking for more accurate SVP detection in retinal video analysis.
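The core context-masking operation described in the abstract — freezing a spatio-temporal region by filling it with original pixel values so a dynamic tube becomes static — can be sketched as follows. This is a minimal illustration assuming a numpy video tensor of shape (T, H, W, C); the function name, bounding-box parameterisation, and reference-frame choice are our own assumptions for exposition, not the paper's implementation.

```python
import numpy as np

def context_mask(video, region, ref_frame=0):
    """Freeze a spatio-temporal region of a video with a context mask.

    Instead of filling the region with gray/black blocks, copy the pixel
    values of one reference frame into every frame, so the region becomes
    temporally static (suppressing any pulsation there) while remaining
    pixel-wise similar to the unmasked video.

    video  : ndarray of shape (T, H, W, C)
    region : (y0, y1, x0, x1) bounding box to freeze
    """
    y0, y1, x0, x1 = region
    masked = video.copy()
    # Broadcast the reference frame's patch across all T frames.
    masked[:, y0:y1, x0:x1] = video[ref_frame, y0:y1, x0:x1]
    return masked

# Toy example: a 4-frame "video" with random temporal dynamics.
rng = np.random.default_rng(0)
vid = rng.random((4, 8, 8, 3))
out = context_mask(vid, region=(2, 6, 2, 6))

# The masked patch is identical in every frame (static tube)...
assert np.all(out[:, 2:6, 2:6] == out[0, 2:6, 2:6])
# ...while pixels outside the patch keep their original dynamics.
assert np.array_equal(out[:, 0, 0], vid[:, 0, 0])
```

Because the fill-in comes from the video itself, the masked clip stays visually plausible, and a label adjustment (e.g. marking the frozen region as SVP-absent) would accompany the mask, as the abstract describes.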
Acknowledgements
This research is funded in part by ARC-Discovery grant (DP220100800 to XY) and ARC-DECRA grant (DE230100477 to XY). We thank all anonymous reviewers and ACs for their constructive suggestions.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Sheng, H., Yu, X., Li, X., Golzan, M. (2024). Context-Based Masking for Spontaneous Venous Pulsations Detection. In: Liu, T., Webb, G., Yue, L., Wang, D. (eds) AI 2023: Advances in Artificial Intelligence. AI 2023. Lecture Notes in Computer Science(), vol 14471. Springer, Singapore. https://doi.org/10.1007/978-981-99-8388-9_42
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8387-2
Online ISBN: 978-981-99-8388-9
eBook Packages: Computer Science (R0)