
Automated Full Scene Parsing for Marine ASVs Using Monocular Vision

  • Short Paper
  • Published in: Journal of Intelligent & Robotic Systems

Abstract

Perceiving and parsing a maritime scene automatically and in real time is a key task for autonomous surface vehicle navigation. We propose a panoptic segmentation framework that allows end-to-end training and cascades multiple tasks to meet the challenges of scene parsing in a complex maritime environment. In our framework, the feature extraction backbone combines Res2Net with an improved FPN. The fusion network neck adds a mask branch to the latest YOLO detector and embeds a bottleneck attention module. We address possible inference conflicts between semantic segmentation and instance segmentation with a panoptic fusion head that resolves them using Dezert-Smarandache theory. We also constructed MarPS-1395, the first fully annotated maritime scene parsing dataset and the first panoptic segmentation dataset in this field. We validated our model on MarPS-1395 as well as a publicly available dataset to investigate real-time performance and the accuracy of the multiple tasks involved in panoptic segmentation: object detection and classification, instance segmentation, and semantic segmentation. The experimental results show that our method robustly accomplishes full scene parsing in a complex maritime environment and achieves a good balance between segmentation accuracy and computational speed.
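The abstract states that conflicts between the semantic and instance segmentation heads are resolved with Dezert-Smarandache theory (DSmT). As a rough illustration of the underlying idea only, and not the authors' implementation (whose details are in the full paper), the sketch below applies DSmT's PCR5 proportional conflict redistribution rule to two per-pixel belief assignments; the function name `pcr5_combine` and the two-class example frame are our own assumptions.

```python
def pcr5_combine(m1, m2):
    """Combine two basic belief assignments over mutually exclusive
    singleton hypotheses with the PCR5 rule (proportional conflict
    redistribution), as used in DSmT-style fusion.

    m1, m2: dicts mapping hypothesis -> mass, each summing to 1.
    Returns the combined mass assignment (also summing to 1).
    """
    hyps = set(m1) | set(m2)
    # Conjunctive (agreement) part: product of masses on the same hypothesis.
    out = {h: m1.get(h, 0.0) * m2.get(h, 0.0) for h in hyps}
    # Redistribute each partial conflict m1(x)*m2(y), x != y, back to x and y
    # proportionally to the masses that caused it.
    for x in hyps:
        for y in hyps:
            if x == y:
                continue
            k = m1.get(x, 0.0) * m2.get(y, 0.0)
            if k == 0.0:
                continue
            denom = m1[x] + m2[y]
            out[x] += m1[x] ** 2 * m2[y] / denom  # share returned to x
            out[y] += m2[y] ** 2 * m1[x] / denom  # share returned to y
    return out


# Hypothetical per-pixel example: the semantic head favours "water",
# the instance head favours "ship".
fused = pcr5_combine({"water": 0.8, "ship": 0.2},
                     {"water": 0.4, "ship": 0.6})
# → water ≈ 0.648, ship ≈ 0.352
```

Unlike Dempster's rule, which normalizes conflict away, PCR5 keeps the total mass at 1 by returning each partial conflict to the hypotheses that generated it, which is one reason DSmT-based rules behave better under the high conflict that can arise between disagreeing segmentation heads.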


Data Availability

Not applicable.

References

  1. Wróbel, K., Montewka, J., Kujala, P.: Towards the development of a system-theoretic model for safety assessment of autonomous merchant vessels. Reliab. Eng. Syst. Saf. 178, 209–224 (2018)

  2. Fields, C.: Safety and Shipping 1912–2012: from Titanic to Costa Concordia. Allianz Global Corporate and Speciality AG, Munich (2012)

  3. Bovcon, B., Muhovič, J., Vranac, D., Mozetič, D., Perš, J., Kristan, M.: MODS: A USV-oriented object detection and obstacle segmentation benchmark. arXiv:2105.02359 (2021)

  4. Lin, G., Liu, F., Milan, A., Shen, C., Reid, I.: Refinenet: multi-path refinement networks for dense prediction. IEEE Trans. Pattern Anal. Mach. Intell. 42(5), 1228–1242 (2019)

  5. Zhao, H., Shi, J., Qi, X., Wang, X. and Jia, J.: Pyramid scene parsing network. In: Proc. of the IEEE conference on computer vision and pattern recognition (CVPR), pp. 2881–2890. IEEE (2017)

  6. Chen, L.C., Zhu, Y., Papandreou, G., Schroff, F. and Adam, H.: Encoder-decoder with atrous separable convolution for semantic image segmentation. In: Proc. of the European conference on computer vision (ECCV), pp. 801–818 (2018)

  7. Bovcon, B., Kristan, M.: WaSR: A water segmentation and refinement maritime obstacle detection network. IEEE Trans. Cybern. 1–14 (2021). https://doi.org/10.1109/TCYB.2021.3085856

  8. Zhang, W., He, X., Li, W., Zhang, Z., Luo, Y., Su, L., Wang, P.: An integrated ship segmentation method based on discriminator and extractor. Image Vis. Comput. 93, 103824 (2020)

  9. Zardoua, Y., Astito, A., Boulaala, M.: A survey on horizon detection algorithms for maritime video surveillance: advances and future techniques. Vis. Comput. 23, 1–21 (2021)

  10. Chen, X., Liu, Y., Achuthan, K.: WODIS: water obstacle detection network based on image segmentation for autonomous surface vehicles in maritime environments. IEEE Trans. Instrum. Meas. 70, 1–13 (2021)

  11. Ganbold, U., and Akashi, T.: The Real-Time Reliable Detection of the Horizon Line on High-Resolution Maritime Images for Unmanned Surface-Vehicle. In: 2020 International Conference on Cyberworlds (CW), pp. 204–210. IEEE (2020)

  12. Yao, L., Kanoulas, D., Ji, Z., and Liu, Y.: ShorelineNet: An Efficient Deep Learning Approach for Shoreline Semantic Segmentation for Unmanned Surface Vehicles. In: Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 1–7. IEEE (2021)

  13. Jeong, C.Y., Yang, H.S., Moon, K.D.: Horizon detection in maritime images using scene parsing network. Electron. Lett. 54(12), 760–762 (2018)

  14. Qiao, D., Liu, G., Lv, T., Li, W., Zhang, J.: Marine vision-based situational awareness using discriminative deep learning: a survey. J. Mar. Sci. Eng. 9(4), 397 (2021)

  15. Kirillov, A., He, K., Girshick, R., Rother, C., and Dollár, P.: Panoptic segmentation. In: Proc. of the IEEE conference on computer vision and pattern recognition (CVPR), pp. 9404–9413. IEEE (2019)

  16. Long, J., Shelhamer, E., and Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proc. of the IEEE conference on computer vision and pattern recognition (CVPR), pp. 3431–3440. IEEE (2015)

  17. Ronneberger, O., Fischer, P., and Brox, T.: U-Net: Convolutional networks for biomedical image segmentation. In: International Conference on Medical image computing and computer-assisted intervention, pp. 234–241 (2015)

  18. Badrinarayanan, V., Kendall, A., Cipolla, R.: SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(12), 2481–2495 (2017)

  19. He, K., Gkioxari, G., Dollár, P., and Girshick, R.: Mask R-CNN. In: Proc. of the IEEE international conference on computer vision (ICCV), pp. 2961–2969. IEEE (2017)

  20. Bolya, D., Zhou, C., Xiao, F., and Lee, Y.J.: YOLACT++: Better real-time instance segmentation. arXiv:1912.06218 (2019)

  21. Lee, Y., and Park, J.: CenterMask: Real-time anchor-free instance segmentation. In: Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 13906–13915. IEEE (2020)

  22. Wang, X., Zhang, R., Kong, T., Li, L., and Shen, C.: SOLOv2: Dynamic, faster and stronger. arXiv:2003.10152 (2020)

  23. Chen, H., Sun, K., Tian, Z., Shen, C., Huang, Y., and Yan, Y.: BlendMask: Top-down meets bottom-up for instance segmentation. In: Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8573–8581. IEEE (2020)

  24. Tian, Z., Shen, C., Chen, H., He, T.: FCOS: a simple and strong anchor-free object detector. IEEE Trans. Pattern Anal. Mach. Intell. (2020)

  25. Xiong, Y., Liao, R., Zhao, H., Hu, R., Bai, M., Yumer, E., and Urtasun, R.: UPSNet: A unified panoptic segmentation network. In: Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8818–8826. IEEE (2019)

  26. Cheng, B., Collins, M. D., Zhu, Y., Liu, T., Huang, T.S., Adam, H., Chen, L.C.: Panoptic-deeplab: A simple, strong, and fast baseline for bottom-up panoptic segmentation. In: Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 12475–12485. IEEE (2020)

  27. Kirillov, A., Girshick, R., He, K., and Dollár, P.: Panoptic feature pyramid networks. In: Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6399–6408. IEEE (2019)

  28. Gosala, N., and Valada, A.: Bird's-eye-view panoptic segmentation using monocular frontal view images. arXiv:2108.03227 (2021)

  29. Gao, S.H., Cheng, M.M., Zhao, K., Zhang, X.Y., Yang, M.H., Torr, P.: Res2Net: a new multi-scale backbone architecture. IEEE Trans. Pattern Anal. Mach. Intell. 43(2), 652–662 (2021)

  30. Lin, T.Y., Dollár, P., Girshick, R., He, K., Hariharan, B., and Belongie, S.: Feature pyramid networks for object detection. In: Proc. of the IEEE conference on computer vision and pattern recognition (CVPR), pp. 2117–2125. IEEE (2017)

  31. Zhu, X., Hu, H., Lin, S., and Dai, J.: Deformable convnets v2: More deformable, better results. In: Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 9308–9316. IEEE (2019)

  32. Lin, T.Y., Goyal, P., Girshick, R., He, K., and Dollár, P.: Focal loss for dense object detection. In: Proc. of the IEEE international conference on computer vision (ICCV), pp. 2980–2988. IEEE (2017)

  33. Huang, Z., Huang, L., Gong, Y., Huang, C., and Wang, X.: Mask Scoring R-CNN. In: Proc. of the IEEE conference on computer vision and pattern recognition (CVPR), pp. 6409–6418. IEEE (2019)

  34. Bochkovskiy, A., Wang, C.Y., and Liao, H.Y.M.: YOLOv4: Optimal speed and accuracy of object detection. arXiv:2004.10934 (2020)

  35. Jaderberg, M., Simonyan, K., and Zisserman, A.: Spatial transformer networks. Advances in neural information processing systems. 28, 2017–2025 (2015)

  36. Hu, J., Shen, L., and Sun, G.: Squeeze-and-Excitation Networks. In: Proc. of the IEEE conference on computer vision and pattern recognition (CVPR), pp. 7132–7141. IEEE (2018)

  37. Woo, S., Park, J., Lee, J.Y., and Kweon, I.S.: CBAM: Convolutional block attention module. In: Proc. of the European conference on computer vision (ECCV), pp. 3–19 (2018)

  38. Park, J., Woo, S., Lee, J.Y., Kweon, I.S.: A simple and light-weight attention module for convolutional neural networks. Int. J. Comput. Vis. 128(4), 783–798 (2020)

  39. Zheng, Z., Wang, P., Liu, W., Li, J., Ye, R., Ren, D.: Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression. In: Proc. of the AAAI Conference on Artificial Intelligence, pp. 12993–13000 (2020)

  40. De Geus, D., Meletis, P., and Dubbelman, G.: Panoptic segmentation with a joint semantic and instance segmentation network. arXiv:1809.02110 (2018)

  41. Dezert, J.: Foundations for a new theory of plausible and paradoxical reasoning. Inf. Secur. 9, 13–57 (2002)

  42. Shafer G.: A mathematical theory of evidence. Princeton university press, Princeton (1976)

  43. Dezert, J., Tchamova, A., Smarandache, F., and Konstantinova, P.: Target type tracking with PCR5 and Dempster's rules: a comparative analysis. In: 9th International Conference on Information Fusion, pp. 1–8. IEEE (2006)

  44. Dezert, J., Liu, Z.G., and Mercier, G.: Edge detection in color images based on DSmT. In: 14th International Conference on Information Fusion, pp. 1–8. IEEE (2011)

  45. Guo, Y., Sengur, A.: NECM: Neutrosophic evidential c-means clustering algorithm. Neural Comput. & Applic. 26(3), 561–571 (2015)

  46. Martin, A., Osswald, C.: A new generalization of the proportional conflict redistribution rule stable in terms of decision. Advances and Applications of DSmT for Information Fusion: Collected Works. 2(2), 69–88 (2006)

  47. Daniel, M.: Classical combination rules generalized to DSm hyper-power sets and their comparison with the hybrid DSm rule. Advances and Applications of DSmT for Information Fusion: Collected Works Volume. 2(2), 89–112 (2006)

  48. Prasad, D.K., Rajan, D., Rachmawati, L., Rajabally, E., Quek, C.: Video processing from electro-optical sensors for object detection and tracking in a maritime environment: a survey. IEEE Trans. Intell. Transp. Syst. 18(8), 1993–2016 (2017)

  49. Qiao, D., Liu, G., Dong, F., Jiang, S.X., Dai, L.: Marine vessel re-identification: a large-scale dataset and global-and-local fusion-based discriminative feature learning. IEEE Access. 8, 27744–27756 (2020)

  50. Zhang, H., Wu, C., Zhang, Z., Zhu, Y., Lin, H., Zhang, Z., Sun, Y., He, T., Mueller, J., Manmatha, R., and Li, M.: ResNeSt: Split-attention networks. arXiv:2004.08955 (2020)

  51. Howish: PyDSmT. https://github.com/howish/PyDSmT, GitHub (2019). Accessed 10 November 2021

  52. Zhao, H., Zhang, Y., Liu, S., Shi, J., Loy, C.C., Lin, D., and Jia, J.: PSANet: Point-wise spatial attention network for scene parsing. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 267–283 (2018)

  53. Wang, J., Sun, K., Cheng, T., Jiang, B., Deng, C., Zhao, Y., Liu, D., Mu, Y., Tan, M., Wang, X., Liu, W.: Deep high-resolution representation learning for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 43(10), 3349–3364 (2020)

  54. Li, Y., Chen, X., Zhu, Z., …

Acknowledgements

… big data application of JMI under Grant KJCX1809.

Author information

Authors and Affiliations

Authors

Contributions

Dalei Qiao: investigation, methodology, formal analysis, validation, resources, writing - original draft. Guangzhong Liu: visualization, writing - review & editing. Wei Li: validation, writing - review & editing. Taizhi Lyu: dataset creation, writing - review & editing. Juan Zhang: data processing, funding acquisition.

Corresponding author

Correspondence to Guangzhong Liu.

Ethics declarations

Ethical Approval

This is an experimental study of an unmanned surface vessel. The Jiangsu Maritime Institute Research Ethics Committee has confirmed that no ethical approval is required.

Consent to Participate

This is an experimental study of an unmanned surface vessel. We confirm that no human-related experiments were involved in this study.

Consent to Publish

We confirm that the work described has not been published before and all authors have read and agreed to the published version of the manuscript.

Competing Interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Qiao, D., Liu, G., Li, W. et al. Automated Full Scene Parsing for Marine ASVs Using Monocular Vision. J Intell Robot Syst 104, 37 (2022). https://doi.org/10.1007/s10846-021-01543-7


Keywords

Navigation