Generating robust real-time object detector with uncertainty via virtual adversarial training

Chen, Yipeng; Xu, Ke; He, Di; Ban, Xiaojuan

doi:10.1007/s13042-021-01416-3

Original Article
Published: 30 August 2021

Volume 13, pages 431–445, (2022)
International Journal of Machine Learning and Cybernetics

Yipeng Chen¹,²,
Ke Xu¹ (ORCID: orcid.org/0000-0003-1809-7413),
Di He¹ &
Xiaojuan Ban²

Abstract

Despite remarkable accuracy improvements in object detectors based on convolutional neural networks (CNNs), applying them in safety-critical domains such as self-driving remains difficult, in part because verifying the correctness of detection results is complex and safety guarantees are lacking. We propose a method for predicting uncertainty that models the bounding box parameters of a real-time object detector with a Gaussian distribution. The predicted uncertainty quantifies the reliability of the network's predictions and can be used to validate detection results at low computational cost. In addition, we redesign the loss function by adding a regularization term based on virtual adversarial training (VAT). VAT, which is defined in terms of the robustness of the conditional label distribution around each input against local perturbation, smooths the output distribution, making the regularized model more robust, with lower uncertainty and better predictions. Considering the trade-off between model size and speed, we use lightweight models as the backbone of a YOLOv3 detector. Experimental results on the PASCAL VOC and MS COCO datasets demonstrate the effectiveness of the proposed approach.
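
The two ideas in the abstract can be illustrated with a minimal sketch. This is not the loss function from the article, whose exact form is not given on this page. It assumes that each bounding box coordinate is predicted as a mean and a standard deviation and scored with a Gaussian negative log-likelihood, and it shows a VAT-style penalty on a toy logistic model p(y=1|x) = sigmoid(w · x) instead of a YOLOv3 detector. All function names (`gaussian_nll`, `box_nll`, `vat_penalty`) and parameters (`eps`, `xi`) are hypothetical, and the fixed start direction replaces the random one that VAT normally samples.

```python
import math


def gaussian_nll(mu, sigma, target):
    """Negative log-likelihood of `target` under N(mu, sigma^2)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return 0.5 * math.log(2 * math.pi * sigma ** 2) + (target - mu) ** 2 / (2 * sigma ** 2)


def box_nll(means, sigmas, true_box):
    """Sum the per-coordinate NLL over the box parameters (assumed independent)."""
    return sum(gaussian_nll(m, s, t) for m, s, t in zip(means, sigmas, true_box))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def bernoulli_kl(p, q):
    """KL(Bernoulli(p) || Bernoulli(q)) for p and q strictly between 0 and 1."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def vat_penalty(w, x, eps=0.1, xi=1e-3):
    """VAT-style smoothness penalty for the toy model p(y=1|x) = sigmoid(w . x)."""
    n = len(x)
    p = sigmoid(dot(w, x))  # prediction at the clean input, treated as fixed
    # Start direction for one power-iteration step. VAT samples it at random;
    # a fixed unit vector (an assumption) keeps this sketch deterministic.
    d = [1.0 / math.sqrt(n)] * n
    q = sigmoid(dot(w, [xv + xi * dv for xv, dv in zip(x, d)]))
    # For this model the gradient of KL(p || q) with respect to the
    # perturbation has the closed form (q - p) * w.
    grad = [(q - p) * wv for wv in w]
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return 0.0  # the model is locally flat along the probed direction
    r_adv = [eps * g / norm for g in grad]  # perturbation of L2 norm eps
    q_adv = sigmoid(dot(w, [xv + rv for xv, rv in zip(x, r_adv)]))
    return bernoulli_kl(p, q_adv)


# Hypothetical box prediction (x, y, w, h) with per-coordinate uncertainty.
localization_loss = box_nll([0.50, 0.40, 0.20, 0.30], [0.05, 0.05, 0.10, 0.10],
                            [0.52, 0.41, 0.25, 0.28])
smoothness_loss = vat_penalty([2.0, -1.0], [0.5, 0.5], eps=0.1)
```

In this sketch a large predicted `sigma` lowers the penalty for a coordinate the model gets wrong but costs a log term when the prediction is accurate, which is why the predicted variance can act as an uncertainty estimate. The VAT term would be added to the training loss with a weighting coefficient, which the abstract does not specify.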

Acknowledgements

This work is sponsored by the National Natural Science Foundation of China (No. 51874022, No. 51674031) and the National Key R&D Program of China (No. 2018YFB0704304).

Author information

Authors and Affiliations

Collaborative Innovation Center of Steel Technology, University of Science and Technology Beijing, Beijing, China
Yipeng Chen, Ke Xu & Di He
School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
Yipeng Chen & Xiaojuan Ban

Corresponding authors

Correspondence to Ke Xu or Xiaojuan Ban.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Chen, Y., Xu, K., He, D. et al. Generating robust real-time object detector with uncertainty via virtual adversarial training. Int. J. Mach. Learn. & Cyber. 13, 431–445 (2022). https://doi.org/10.1007/s13042-021-01416-3

Received: 06 June 2020
Accepted: 20 August 2021
Published: 30 August 2021
Issue Date: February 2022
DOI: https://doi.org/10.1007/s13042-021-01416-3
