Abstract
Spontaneous retinal venous pulsations (SVPs) are vital dynamic biomarkers: rhythmic changes of the central retinal vein observed at the optic disc region (ODR) of the eye. In light of their crucial clinical role, automatic detection of SVPs from fundus videos has become an area of burgeoning research. However, inherent eye movements and variable retinal video quality pose significant challenges to direct SVP detection with existing deep learning models. In response, we devise a spatio-temporal context-based masking approach (STC Masking) that exploits the spatio-temporal characteristics of SVPs to enhance their detection in retinal videos. We first apply a spatio-temporal mask to clip the video into an ODR-focused video tube. Diverging from conventional masking with gray or black blocks, we then employ a context masking method that uses the original pixel values of the video frames as the mask fill-in. The context mask temporally transforms dynamic video tubes into static ones, thus changing the pulsation status of the SVPs. Correspondingly, we adjust the SVP video labels according to the extent of the masked regions to avoid ambiguity in data labelling. This strategy yields masked videos that are pixel-wise similar to the unmasked ones yet carry contrasting semantics in SVP-presenting regions. It enables the network to capture the most discriminative regions through spatio-temporal variations, allowing explicit detection of SVP presence in a video. Our experiments demonstrate the efficacy of the STC masking strategy, which outperforms baseline methods. This work thereby underscores the potential of grid context-based masking for more accurate SVP detection in retinal video analysis.
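The core context-masking operation described in the abstract — freezing a spatio-temporal region by filling it with original pixel values so a dynamic tube becomes static — can be sketched as follows. This is a minimal illustration assuming a numpy video tensor of shape (T, H, W, C); the function name, bounding-box parameterisation, and reference-frame choice are our own assumptions for exposition, not the paper's implementation.

```python
import numpy as np

def context_mask(video, region, ref_frame=0):
    """Freeze a spatio-temporal region of a video with a context mask.

    Instead of filling the region with gray/black blocks, copy the pixel
    values of one reference frame into every frame, so the region becomes
    temporally static (suppressing any pulsation there) while remaining
    pixel-wise similar to the unmasked video.

    video  : ndarray of shape (T, H, W, C)
    region : (y0, y1, x0, x1) bounding box to freeze
    """
    y0, y1, x0, x1 = region
    masked = video.copy()
    # Broadcast the reference frame's patch across all T frames.
    masked[:, y0:y1, x0:x1] = video[ref_frame, y0:y1, x0:x1]
    return masked

# Toy example: a 4-frame "video" with random temporal dynamics.
rng = np.random.default_rng(0)
vid = rng.random((4, 8, 8, 3))
out = context_mask(vid, region=(2, 6, 2, 6))

# The masked patch is identical in every frame (static tube)...
assert np.all(out[:, 2:6, 2:6] == out[0, 2:6, 2:6])
# ...while pixels outside the patch keep their original dynamics.
assert np.array_equal(out[:, 0, 0], vid[:, 0, 0])
```

Because the fill-in comes from the video itself, the masked clip stays visually plausible, and a label adjustment (e.g. marking the frozen region as SVP-absent) would accompany the mask, as the abstract describes.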
Acknowledgements
This research is funded in part by ARC-Discovery grant (DP220100800 to XY) and ARC-DECRA grant (DE230100477 to XY). We thank all anonymous reviewers and ACs for their constructive suggestions.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Sheng, H., Yu, X., Li, X., Golzan, M. (2024). Context-Based Masking for Spontaneous Venous Pulsations Detection. In: Liu, T., Webb, G., Yue, L., Wang, D. (eds) AI 2023: Advances in Artificial Intelligence. AI 2023. Lecture Notes in Computer Science(), vol 14471. Springer, Singapore. https://doi.org/10.1007/978-981-99-8388-9_42
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8387-2
Online ISBN: 978-981-99-8388-9
eBook Packages: Computer Science (R0)