
Automated Full Scene Parsing for Marine ASVs Using Monocular Vision

  • Short Paper
  • Published in: Journal of Intelligent & Robotic Systems

Abstract

Perceiving and parsing a maritime scene automatically and in real time is a key task for autonomous surface vehicle navigation. We propose a panoptic segmentation framework that allows end-to-end training and cascades multiple tasks to meet the challenges of scene parsing in a complex maritime environment. In our framework, the feature extraction backbone combines Res2Net with an improved FPN. The fusion network neck adds a mask branch to the latest YOLO detector and embeds a bottleneck attention module. We address possible inference conflicts between semantic segmentation and instance segmentation with a panoptic fusion head that resolves them using Dezert-Smarandache theory. We also constructed MarPS-1395, the first fully annotated maritime scene parsing dataset and the first panoptic segmentation dataset in this field. We validated our model on MarPS-1395 as well as a publicly available dataset to investigate real-time performance and the accuracy of the multiple tasks involved in panoptic segmentation: object detection and classification, instance segmentation, and semantic segmentation. The experimental results show that our method robustly accomplishes full scene parsing in a complex maritime environment and achieves a good balance between segmentation accuracy and computational speed.
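The abstract states that conflicts between the semantic and instance segmentation heads are resolved with Dezert-Smarandache theory (DSmT). As a rough illustration of the underlying idea only, and not the authors' implementation (whose details are in the full paper), the sketch below applies DSmT's PCR5 proportional conflict redistribution rule to two per-pixel belief assignments; the function name `pcr5_combine` and the two-class example frame are our own assumptions.

```python
def pcr5_combine(m1, m2):
    """Combine two basic belief assignments over mutually exclusive
    singleton hypotheses with the PCR5 rule (proportional conflict
    redistribution), as used in DSmT-style fusion.

    m1, m2: dicts mapping hypothesis -> mass, each summing to 1.
    Returns the combined mass assignment (also summing to 1).
    """
    hyps = set(m1) | set(m2)
    # Conjunctive (agreement) part: product of masses on the same hypothesis.
    out = {h: m1.get(h, 0.0) * m2.get(h, 0.0) for h in hyps}
    # Redistribute each partial conflict m1(x)*m2(y), x != y, back to x and y
    # proportionally to the masses that caused it.
    for x in hyps:
        for y in hyps:
            if x == y:
                continue
            k = m1.get(x, 0.0) * m2.get(y, 0.0)
            if k == 0.0:
                continue
            denom = m1[x] + m2[y]
            out[x] += m1[x] ** 2 * m2[y] / denom  # share returned to x
            out[y] += m2[y] ** 2 * m1[x] / denom  # share returned to y
    return out


# Hypothetical per-pixel example: the semantic head favours "water",
# the instance head favours "ship".
fused = pcr5_combine({"water": 0.8, "ship": 0.2},
                     {"water": 0.4, "ship": 0.6})
# → water ≈ 0.648, ship ≈ 0.352
```

Unlike Dempster's rule, which normalizes conflict away, PCR5 keeps the total mass at 1 by returning each partial conflict to the hypotheses that generated it, which is one reason DSmT-based rules behave better under the high conflict that can arise between disagreeing segmentation heads.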


Data Availability

Not applicable.

References

  1. Wróbel, K., Montewka, J., Kujala, P.: Towards the development of a system-theoretic model for safety assessment of autonomous merchant vessels. Reliab. Eng. Syst. Saf. 178, 209–224 (2018)

  2. Fields, C.: Safety and Shipping 1912–2012: from Titanic to Costa Concordia. Allianz Global Corporate and Speciality AG, Munich (2012)

  3. Bovcon, B., Muhovič, J., Vranac, D., Mozetič, D., Perš, J., Kristan, M.: MODS: A USV-oriented object detection and obstacle segmentation benchmark. arXiv:2105.02359 (2021)

  4. Lin, G., Liu, F., Milan, A., Shen, C., Reid, I.: Refinenet: multi-path refinement networks for dense prediction. IEEE Trans. Pattern Anal. Mach. Intell. 42(5), 1228–1242 (2019)

  5. Zhao, H., Shi, J., Qi, X., Wang, X. and Jia, J.: Pyramid scene parsing network. In: Proc. of the IEEE conference on computer vision and pattern recognition (CVPR), pp. 2881–2890. IEEE (2017)

  6. Chen, L.C., Zhu, Y., Papandreou, G., Schroff, F. and Adam, H.: Encoder-decoder with atrous separable convolution for semantic image segmentation. In: Proc. of the European conference on computer vision (ECCV), pp. 801–818 (2018)

  7. Bovcon, B., Kristan, M.: WaSR: A water segmentation and refinement maritime obstacle detection network. IEEE Trans. Cybern. 1–14 (2021). https://doi.org/10.1109/TCYB.2021.3085856

  8. Zhang, W., He, X., Li, W., Zhang, Z., Luo, Y., Su, L., Wang, P.: An integrated ship segmentation method based on discriminator and extractor. Image Vis. Comput. 93, 103824 (2020)

  9. Zardoua, Y., Astito, A., Boulaala, M.: A survey on horizon detection algorithms for maritime video surveillance: advances and future techniques. Vis. Comput. 23, 1–21 (2021)

  10. Chen, X., Liu, Y., Achuthan, K.: WODIS: water obstacle detection network based on image segmentation for autonomous surface vehicles in maritime environments. IEEE Trans. Instrum. Meas. 70, 1–13 (2021)

  11. Ganbold, U., and Akashi, T.: The Real-Time Reliable Detection of the Horizon Line on High-Resolution Maritime Images for Unmanned Surface-Vehicle. In: 2020 International Conference on Cyberworlds (CW), pp. 204–210. IEEE (2020)

  12. Yao, L., Kanoulas, D., Ji, Z., and Liu, Y.: ShorelineNet: An Efficient Deep Learning Approach for Shoreline Semantic Segmentation for Unmanned Surface Vehicles. In: Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 1–7. IEEE (2021)

  13. Jeong, C.Y., Yang, H.S., Moon, K.D.: Horizon detection in maritime images using scene parsing network. Electron. Lett. 54(12), 760–762 (2018)

  14. Qiao, D., Liu, G., Lv, T., Li, W., Zhang, J.: Marine vision-based situational awareness using discriminative deep learning: a survey. J. Mar. Sci. Eng. 9(4), 397 (2021)

  15. Kirillov, A., He, K., Girshick, R., Rother, C., and Dollár, P.: Panoptic segmentation. In: Proc. of the IEEE conference on computer vision and pattern recognition (CVPR), pp. 9404–9413. IEEE (2019)

  16. Long, J., Shelhamer, E., and Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proc. of the IEEE conference on computer vision and pattern recognition (CVPR), pp. 3431–3440. IEEE (2015)

  17. Ronneberger, O., Fischer, P., and Brox, T.: U-Net: Convolutional networks for biomedical image segmentation. In: International Conference on Medical image computing and computer-assisted intervention, pp. 234–241 (2015)

  18. Badrinarayanan, V., Kendall, A., Cipolla, R.: SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(12), 2481–2495 (2017)

  19. He, K., Gkioxari, G., Dollár, P., and Girshick, R.: Mask R-CNN. In: Proc. of the IEEE international conference on computer vision (ICCV), pp. 2961–2969. IEEE (2017)

  20. Bolya, D., Zhou, C., Xiao, F., and Lee, Y.J.: YOLACT++: Better real-time instance segmentation. arXiv:1912.06218 (2019)

  21. Lee, Y., and Park, J.: CenterMask: Real-time anchor-free instance segmentation. In: Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 13906–13915. IEEE (2020)

  22. Wang, X., Zhang, R., Kong, T., Li, L., and Shen, C.: SOLOv2: Dynamic, faster and stronger. arXiv:2003.10152 (2020)

  23. Chen, H., Sun, K., Tian, Z., Shen, C., Huang, Y., and Yan, Y.: BlendMask: Top-down meets bottom-up for instance segmentation. In: Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8573–8581. IEEE (2020)

  24. Tian, Z., Shen, C., Chen, H., He, T.: FCOS: a simple and strong anchor-free object detector. IEEE Trans. Pattern Anal. Mach. Intell. (2020)

  25. Xiong, Y., Liao, R., Zhao, H., Hu, R., Bai, M., Yumer, E., and Urtasun, R.: UPSNet: A unified panoptic segmentation network. In: Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8818–8826. IEEE (2019)

  26. Cheng, B., Collins, M. D., Zhu, Y., Liu, T., Huang, T.S., Adam, H., Chen, L.C.: Panoptic-deeplab: A simple, strong, and fast baseline for bottom-up panoptic segmentation. In: Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 12475–12485. IEEE (2020)

  27. Kirillov, A., Girshick, R., He, K., and Dollár, P.: Panoptic feature pyramid networks. In: Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6399–6408. IEEE (2019)

  28. Gosala, N., and Valada, A.: Bird's-eye-view panoptic segmentation using monocular frontal view images. arXiv:2108.03227 (2021)

  29. Gao, S.H., Cheng, M.M., Zhao, K., Zhang, X.Y., Yang, M.H., Torr, P.: Res2Net: a new multi-scale backbone architecture. IEEE Trans. Pattern Anal. Mach. Intell. 43(2), 652–662 (2021)

  30. Lin, T.Y., Dollár, P., Girshick, R., He, K., Hariharan, B., and Belongie, S.: Feature pyramid networks for object detection. In: Proc. of the IEEE conference on computer vision and pattern recognition (CVPR), pp. 2117–2125. IEEE (2017)

  31. Zhu, X., Hu, H., Lin, S., and Dai, J.: Deformable convnets v2: More deformable, better results. In: Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 9308–9316. IEEE (2019)

  32. Lin, T.Y., Goyal, P., Girshick, R., He, K., and Dollár, P.: Focal loss for dense object detection. In: Proc. of the IEEE international conference on computer vision (ICCV), pp. 2980–2988. IEEE (2017)

  33. Huang, Z., Huang, L., Gong, Y., Huang, C., and Wang, X.: Mask Scoring R-CNN. In: Proc. of the IEEE conference on computer vision and pattern recognition (CVPR), pp. 6409–6418. IEEE (2019)

  34. Bochkovskiy, A., Wang, C.Y., and Liao, H.Y.M.: YOLOv4: Optimal speed and accuracy of object detection. arXiv:2004.10934 (2020)

  35. Jaderberg, M., Simonyan, K., and Zisserman, A.: Spatial transformer networks. Advances in neural information processing systems. 28, 2017–2025 (2015)

  36. Hu, J., Shen, L., and Sun, G.: Squeeze-and-Excitation Networks. In: Proc. of the IEEE conference on computer vision and pattern recognition (CVPR), pp. 7132–7141. IEEE (2018)

  37. Woo, S., Park, J., Lee, J.Y., and Kweon, I.S.: CBAM: Convolutional block attention module. In: Proc. of the European conference on computer vision (ECCV), pp. 3–19 (2018)

  38. Park, J., Woo, S., Lee, J.Y., Kweon, I.S.: A simple and light-weight attention module for convolutional neural networks. Int. J. Comput. Vis. 128(4), 783–798 (2020)

  39. Zheng, Z., Wang, P., Liu, W., Li, J., Ye, R., Ren, D.: Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression. In: Proc. of the AAAI Conference on Artificial Intelligence, pp. 12993–13000 (2020)

  40. De Geus, D., Meletis, P., and Dubbelman, G.: Panoptic segmentation with a joint semantic and instance segmentation network. arXiv:1809.02110 (2018)

  41. Dezert, J.: Foundations for a new theory of plausible and paradoxical reasoning. Inf. Secur. 9, 13–57 (2002)

  42. Shafer G.: A mathematical theory of evidence. Princeton university press, Princeton (1976)

  43. Dezert, J., Tchamova, A., Smarandache, F., and Konstantinova, P.: Target type tracking with PCR5 and Dempster's rules: a comparative analysis. In: 9th International Conference on Information Fusion, pp. 1–8. IEEE (2006)

  44. Dezert, J., Liu, Z.G., and Mercier, G.: Edge detection in color images based on DSmT. In: 14th International Conference on Information Fusion, pp. 1–8. IEEE (2011)

  45. Guo, Y., Sengur, A.: NECM: Neutrosophic evidential c-means clustering algorithm. Neural Comput. & Applic. 26(3), 561–571 (2015)

  46. Martin, A., Osswald, C.: A new generalization of the proportional conflict redistribution rule stable in terms of decision. Advances and Applications of DSmT for Information Fusion: Collected Works. 2(2), 69–88 (2006)

  47. Daniel, M.: Classical combination rules generalized to DSm hyper-power sets and their comparison with the hybrid DSm rule. Advances and Applications of DSmT for Information Fusion: Collected Works Volume. 2(2), 89–112 (2006)

  48. Prasad, D.K., Rajan, D., Rachmawati, L., Rajabally, E., Quek, C.: Video processing from electro-optical sensors for object detection and tracking in a maritime environment: a survey. IEEE Trans. Intell. Transp. Syst. 18(8), 1993–2016 (2017)

  49. Qiao, D., Liu, G., Dong, F., Jiang, S.X., Dai, L.: Marine vessel re-identification: a large-scale dataset and global-and-local fusion-based discriminative feature learning. IEEE Access. 8, 27744–27756 (2020)

  50. Zhang, H., Wu, C., Zhang, Z., Zhu, Y., Lin, H., Zhang, Z., Sun, Y., He, T., Mueller, J., Manmatha, R., and Li, M.: ResNeSt: Split-attention networks. arXiv:2004.08955 (2020)

  51. Howish: PyDSmT. https://github.com/howish/PyDSmT, GitHub (2019). Accessed 10 November 2021

  52. Zhao, H., Zhang, Y., Liu, S., Shi, J., Loy, C.C., Lin, D., and Jia, J.: PSANet: Point-wise spatial attention network for scene parsing. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 267–283 (2018)

  53. Wang, J., Sun, K., Cheng, T., Jiang, B., Deng, C., Zhao, Y., Liu, D., Mu, Y., Tan, M., Wang, X., Liu, W.: Deep high-resolution representation learning for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 43(10), 3349–3364 (2020)

  54. Li, Y., Chen, X., Zhu, Z., …

Acknowledgements

… big data application of JMI under Grant KJCX1809.

Author information

Authors and Affiliations

Authors

Contributions

Dalei Qiao: investigation, methodology, formal analysis, validation, resources, writing - original draft. Guangzhong Liu: visualization, writing - review & editing. Wei Li: validation, writing - review & editing. Taizhi Lyu: dataset creation, writing - review & editing. Juan Zhang: data processing, funding acquisition.

Corresponding author

Correspondence to Guangzhong Liu.

Ethics declarations

Ethical Approval

This is an experimental study of an unmanned surface vessel. The Jiangsu Maritime Institute Research Ethics Committee has confirmed that no ethical approval is required.

Consent to Participate

This is an experimental study of an unmanned surface vessel. We confirm that no human-related experiments were involved in this study.

Consent to Publish

We confirm that the work described has not been published before and all authors have read and agreed to the published version of the manuscript.

Competing Interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Qiao, D., Liu, G., Li, W. et al. Automated Full Scene Parsing for Marine ASVs Using Monocular Vision. J Intell Robot Syst 104, 37 (2022). https://doi.org/10.1007/s10846-021-01543-7


Keywords

Navigation