Lightweight underwater object detection based on image enhancement and multi-attention

Tian, Tian; Cheng, Jixiang; Wu, Dan; Li, Zhidan

doi:10.1007/s11042-023-18008-8

Published: 10 January 2024

Multimedia Tools and Applications (2024)

Abstract

Underwater object detection, which is crucial to the exploration and exploitation of marine resources, remains challenging because the images available as sources of supervision are noisy, low in contrast, and color-distorted. To address the low detection accuracy caused by degraded images and the inefficiency caused by the large number of parameters in most deep neural networks, this paper proposes a novel lightweight deep learning model with image enhancement and multi-attention. First, the image enhancement algorithm MSRCR (multi-scale Retinex with color restoration) is applied to improve image quality and thereby the training of the deep learning model. Then, YOLOX is used as the baseline model and GhostNet is adopted as the backbone network to reduce the computation budget. Finally, a multi-attention module, LCR, which considers the level, channel, and spatial domains, is devised and integrated into the feature pyramid network to enhance feature learning ability and detection accuracy. Experimental results show that on the considered datasets our model achieves an mAP of 77.32% and a size of 18.5 MB, which are 1.25% higher and 46.4% smaller, respectively, than the values of the baseline network, indicating that our method achieves superior detection precision while keeping the model lightweight.
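
To give a rough sense of why a GhostNet-style backbone can reduce the parameter budget, the following minimal sketch compares the weight count of an ordinary convolution with that of a Ghost module as described in the GhostNet paper: a primary convolution produces a fraction 1/s of the output maps, and cheap d x d depthwise operations generate the rest. The function names, the layer sizes, and the choices s = 2 and d = 3 are illustrative assumptions, not values taken from this article; biases and normalization layers are ignored.

```python
import math


def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights of an ordinary k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_params(c_in: int, c_out: int, k: int = 1, d: int = 3, s: int = 2) -> int:
    """Weights of a Ghost module (bias and normalization ignored).

    Hypothetical helper for illustration:
      k: kernel size of the primary convolution
      d: kernel size of the cheap depthwise operations
      s: ratio of total output maps to intrinsic maps
    """
    intrinsic = math.ceil(c_out / s)          # maps from the primary convolution
    primary = c_in * intrinsic * k * k        # ordinary conv on fewer channels
    cheap = intrinsic * (s - 1) * d * d       # one d x d filter per ghost map
    return primary + cheap


if __name__ == "__main__":
    # Illustrative layer sizes only; not taken from the article.
    for k in (1, 3):
        full = conv_params(128, 256, k)
        ghost = ghost_params(128, 256, k=k)
        print(f"k={k}: conv={full}, ghost={ghost}, ratio={full / ghost:.2f}")
```

Under these assumptions the saving per layer approaches the ratio s as the primary convolution dominates the count. The overall model size reduction reported in the abstract also depends on the rest of the network, so this per-layer figure should not be read as a reproduction of the reported result.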

Data Availability

All evaluation datasets in our experiments are public datasets and they are available online.

Author information

Authors and Affiliations

School of Electrical Engineering and Information, Southwest Petroleum University, Chengdu, 610500, China
Tian Tian, Jixiang Cheng, Dan Wu & Zhidan Li

Corresponding author

Correspondence to Jixiang Cheng.

Ethics declarations

Competing Interests

The authors declare that they have no relevant financial or nonfinancial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Tian, T., Cheng, J., Wu, D. et al. Lightweight underwater object detection based on image enhancement and multi-attention. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-023-18008-8

Received: 19 October 2022
Revised: 13 September 2023
Accepted: 25 December 2023
Published: 10 January 2024
DOI: https://doi.org/10.1007/s11042-023-18008-8
