
Real-time water surface target detection based on improved YOLOv7 for Chengdu Sand River

  • Research
  • Published in: Journal of Real-Time Image Processing

Abstract

Obtaining accurate detection results in a timely manner remains a challenge in complex and changing water surface scenes. Detecting targets on the water surface in real time is difficult because they move rapidly, are small, and appear fragmented. In addition, traditional detection methods are often labor-intensive and time-consuming, especially when dealing with large water bodies such as rivers and lakes. This paper presents an improved water surface target detection algorithm based on the YOLOv7 (you only look once) model to enhance the performance of water surface target detection. We improve the accuracy and speed of surface target detection by refining three key structures: the network aggregation structure, the pyramid pooling structure, and the down-sampling structure. Furthermore, we deployed the model on mobile devices and designed detection software that supports real-time detection on both images and videos. The experimental results demonstrate that the improved model outperforms the original YOLOv7 model: it exhibits a 6.4% boost in accuracy, a 4.2% improvement in recall, a 4.1% increase in mAP, a 14.3% reduction in parameter count, and achieves 87 FPS. The software can accurately recognize 11 typical water surface targets and demonstrates excellent water surface target detection capability.
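The abstract describes a real-time pipeline that runs an improved YOLOv7 model over images and video on mobile devices. As a minimal, hedged sketch of that kind of workflow (not the authors' released code), the example below runs a hypothetical ONNX export of a YOLOv7-style detector over video frames with OpenCV and ONNX Runtime; the model file name, input resolution, and video source are assumptions, and decoding of the raw network outputs is only indicated in a comment because it depends on how the model was exported.

```python
# Illustrative sketch only: real-time video inference with a YOLOv7-style
# detector exported to ONNX. Model path, input size, and video source are
# hypothetical placeholders, not the authors' released artifacts.
import cv2
import numpy as np
import onnxruntime as ort

MODEL_PATH = "improved_yolov7_water.onnx"  # assumed ONNX export of the improved model
INPUT_SIZE = 640                           # assumed square network input resolution

session = ort.InferenceSession(MODEL_PATH, providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name

def preprocess(frame):
    """Resize to the network input, convert BGR->RGB, scale to [0, 1], NCHW layout."""
    img = cv2.resize(frame, (INPUT_SIZE, INPUT_SIZE))
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
    return np.transpose(img, (2, 0, 1))[None, ...]  # shape (1, 3, H, W)

cap = cv2.VideoCapture("sand_river_clip.mp4")  # or 0 for a live camera
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    outputs = session.run(None, {input_name: preprocess(frame)})
    # `outputs` holds raw predictions; confidence filtering, NMS, and rescaling
    # boxes back to the original frame depend on the export format.
cap.release()
```

In a deployment like the one described, the same loop would consume camera frames instead of a file and draw the decoded boxes for the 11 water surface target classes.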



Data availability

No datasets were generated or analysed during the current study.


Author information

Authors and Affiliations

Authors

Contributions

Mei Yang wrote the main manuscript text and prepared all figures. Huajun Wang provided valuable guidance on this article. All authors reviewed the manuscript.

Corresponding author

Correspondence to Mei Yang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Yang, M., Wang, H. Real-time water surface target detection based on improved YOLOv7 for Chengdu Sand River. J Real-Time Image Proc 21, 127 (2024). https://doi.org/10.1007/s11554-024-01510-z

