Abstract
Obtaining accurate detection results in a timely manner is difficult when water surface targets are complex and constantly changing. Real-time detection is challenging because targets on the water surface move quickly, are small, and often appear fragmented. In addition, traditional detection methods are labor-intensive and time-consuming, especially for large water bodies such as rivers and lakes. This paper presents an improved water surface target detection algorithm based on the YOLOv7 (You Only Look Once) model. We improve the accuracy and speed of water surface target detection by modifying three key structures: the network aggregation structure, the pyramid pooling structure, and the down-sampling structure. We also deployed the model on mobile devices and developed detection software that supports real-time detection on images and videos. Experimental results show that the improved model outperforms the original YOLOv7 model, with a 6.4% gain in accuracy, a 4.2% gain in recall, a 4.1% gain in mAP, and a 14.3% reduction in parameter count, while achieving 87 FPS. The software accurately recognizes 11 typical targets on the water surface and shows strong water surface target detection capability.
Data availability
No datasets were generated or analysed during the current study.
Author information
Contributions
Mei Yang wrote the main manuscript text and prepared all figures. Huajun Wang provided helpful guidance on the article. All authors reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
About this article
Cite this article
Yang, M., Wang, H. Real-time water surface target detection based on improved YOLOv7 for Chengdu Sand River. J Real-Time Image Proc 21, 127 (2024). https://doi.org/10.1007/s11554-024-01510-z
DOI: https://doi.org/10.1007/s11554-024-01510-z