Boosting Breast Ultrasound Video Classification by the Guidance of Keyframe Feature Centers

  • Conference paper
Medical Image Computing and Computer Assisted Intervention – MICCAI 2023 (MICCAI 2023)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14224))

Abstract

Breast ultrasound videos contain richer information than static ultrasound images, so developing video models for this diagnosis task is more meaningful. However, ultrasound video datasets are much harder to collect than image datasets. In this paper, we explore whether a static image dataset can be used to improve ultrasound video classification. To this end, we propose KGA-Net and a coherence loss. KGA-Net is trained on both video clips and static images. The coherence loss uses feature centers generated from the static images to guide frame attention in the video model. KGA-Net improves performance on the public BUSV dataset by a large margin. Visualizations of the frame attention demonstrate the explainability of our method. The code and model weights are released at https://github.com/PlayerSAL/KGA-Net.
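The abstract does not give the formula for the coherence loss, so the following is only an illustrative sketch of the general idea, not the formulation used in KGA-Net. It assumes that a feature center is the mean feature vector of the static images of one class, that each frame is scored by its cosine similarity to that center, and that the frame attention is pulled towards the softmax of those scores with a KL divergence. The function names, the temperature, and the toy feature vectors are all hypothetical.

```python
import math

def class_centers(features, labels):
    """Mean feature vector per class, computed from static-image features."""
    sums, counts = {}, {}
    for vec, lab in zip(features, labels):
        acc = sums.setdefault(lab, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}

def cosine(a, b):
    """Cosine similarity between two vectors (small epsilon avoids division by zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb + 1e-12)

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def coherence_loss(frame_attention, frame_features, center, temperature=0.1):
    """Assumed form: KL(target || attention), where the target is the softmax
    of each frame's similarity to the class feature center."""
    target = softmax([cosine(f, center) / temperature for f in frame_features])
    return sum(t * math.log((t + 1e-12) / (a + 1e-12))
               for t, a in zip(target, frame_attention))

# Toy 2-D features standing in for backbone outputs (hypothetical values).
image_feats = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.2, 0.8]]
image_labels = ["malignant", "malignant", "benign", "benign"]
centers = class_centers(image_feats, image_labels)

frames = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]
uniform_attention = [1 / 3, 1 / 3, 1 / 3]
target_attention = softmax([cosine(f, centers["malignant"]) / 0.1 for f in frames])

loss_uniform = coherence_loss(uniform_attention, frames, centers["malignant"])
loss_matched = coherence_loss(target_attention, frames, centers["malignant"])
```

In this toy setup, attention that already follows the frame-to-center similarity incurs essentially no loss, while uniform attention is penalized, which is the sense in which image-derived centers could guide frame attention.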

A. Sun and Z. Zhang contributed equally.

Acknowledgements

This work is supported by the National Key R&D Program of China (2022ZD0114900) and the National Science Foundation of China (NSFC62276005).

Author information

Corresponding author

Correspondence to Liwei Wang.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Sun, A., Zhang, Z., Lei, M., Dai, Y., Wang, D., Wang, L. (2023). Boosting Breast Ultrasound Video Classification by the Guidance of Keyframe Feature Centers. In: Greenspan, H., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2023. MICCAI 2023. Lecture Notes in Computer Science, vol 14224. Springer, Cham. https://doi.org/10.1007/978-3-031-43904-9_43

  • DOI: https://doi.org/10.1007/978-3-031-43904-9_43

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-43903-2

  • Online ISBN: 978-3-031-43904-9

  • eBook Packages: Computer Science, Computer Science (R0)
