An efficient weakly semi-supervised method for object automated annotation

Wang, Xingzheng; Wei, Guoyao; Chen, Songwei; Liu, Jiehao

doi:10.1007/s11042-023-15305-0

Published: 17 June 2023

Volume 83, pages 9417–9440, (2024)

Multimedia Tools and Applications

Xingzheng Wang¹ (ORCID: orcid.org/0000-0002-5433-3631),
Guoyao Wei¹,
Songwei Chen¹ &
Jiehao Liu¹

Abstract

Object annotation is essential for computer vision tasks, and more high-quality annotated data can effectively improve the performance of vision models. However, manual annotation is time-consuming (annotating a single box takes 35 s). Recent studies have explored faster automated annotation, among which weakly supervised methods stand out. Weakly supervised methods learn to automatically localize objects in images from weak annotations, e.g., class tags or points, which replace manual bounding box annotations. Although relying on a single type of weak annotation saves a large amount of time, it leads to poor annotation quality, particularly for complex scenes containing multiple objects. To balance annotation time and quality, we propose a weakly semi-supervised automated annotation method. Its main idea is to incorporate point-labeled and fully labeled annotations into a teacher-student framework for training, so as to jointly localize the object bounding boxes on all point-labeled images. We also propose two effective techniques within this framework to make better use of these mixed annotations. The first is a point-guided sample assignment technique, which optimizes the loss calculation. The second is a pseudo-label filtering technique, which generates accurate pseudo labels for model training by utilizing the point and box localization confidences. Extensive experiments on MSCOCO demonstrate that our method outperforms existing automated annotation methods. In particular, when using 95% point-labeled and 5% fully labeled data, our approach reduces the annotation time by approximately 52% and achieves an annotation quality of 87.4% mIoU.
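The abstract names the pseudo-label filtering idea and the mIoU quality metric without giving their details, so the following Python sketch is only an illustration of the general pattern and not the authors' implementation. It assumes a simplified rule: a predicted box is kept as a pseudo label only if it contains the annotated point and its confidence reaches a threshold, and the highest-scoring survivor is chosen. The names `Prediction`, `filter_pseudo_label`, `mean_iou` and the `min_score` value are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), corner format
Point = Tuple[float, float]              # (x, y) annotated point


@dataclass
class Prediction:
    box: Box
    score: float  # assumed localization confidence in [0, 1]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two corner-format boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def contains(box: Box, point: Point) -> bool:
    """True if the point lies inside the box (borders included)."""
    x, y = point
    return box[0] <= x <= box[2] and box[1] <= y <= box[3]


def filter_pseudo_label(
    preds: List[Prediction], point: Point, min_score: float = 0.5
) -> Optional[Box]:
    """Pick one pseudo box for an annotated point.

    Assumed rule (not taken from the paper): discard boxes that miss the
    point or score below min_score, then keep the most confident one.
    """
    candidates = [
        p for p in preds if p.score >= min_score and contains(p.box, point)
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.score).box


def mean_iou(pseudo: List[Optional[Box]], gt: List[Box]) -> float:
    """Mean IoU between pseudo boxes and ground truth, paired by index.

    An object without a pseudo box contributes an IoU of 0.
    """
    if not gt:
        return 0.0
    total = sum(iou(p, g) if p is not None else 0.0 for p, g in zip(pseudo, gt))
    return total / len(gt)
```

In this sketch the point acts as a hard spatial constraint and the confidence as a soft ranking signal; a real system would likely combine several confidences and handle multiple objects per image, which the abstract does not specify.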

Data Availability Statement

The datasets generated and analysed during the current study are not publicly available due to the excessive size of MSCOCO but are available from the corresponding author on reasonable request.

Acknowledgments

This work was supported by the NSFC fund (62171288), the Shenzhen Fundamental Research fund under Grants 20200810150441003 and JCYJ20190808143415801, and the Guangdong Basic and Applied Basic Research Foundation under Grants 2020A1515011559 and 2021A1515012287.

Author information

Authors and Affiliations

College of Mechatronics and Control Engineering, Shenzhen University, Shenzhen, 518000, China
Xingzheng Wang, Guoyao Wei, Songwei Chen & Jiehao Liu

Corresponding author

Correspondence to Xingzheng Wang.

Ethics declarations

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, X., Wei, G., Chen, S. et al. An efficient weakly semi-supervised method for object automated annotation. Multimed Tools Appl 83, 9417–9440 (2024). https://doi.org/10.1007/s11042-023-15305-0

Received: 24 November 2022
Revised: 09 March 2023
Accepted: 06 April 2023
Published: 17 June 2023
Issue Date: January 2024
DOI: https://doi.org/10.1007/s11042-023-15305-0
