Abstract
We review the basic ideas used to solve the problems of detecting and classifying objects in images with neural network technologies. Key publications on the most popular approaches to improving classification accuracy are considered. We show that, over the last decade, neural network object-detection methods have achieved significant success through convolutional architectures and deep learning on large databases. The main shortcomings, limitations, and possible directions for improving existing approaches are analyzed.
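The convolution operation underlying the networks surveyed here can be illustrated with a minimal sketch. The code below is illustrative only (not any specific system from the review) and implements valid-mode 2D cross-correlation, the per-channel building block of a convolutional layer; real frameworks such as PyTorch implement it far more efficiently.

```python
def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation of a 2D list `image` with `kernel`.

    This is the elementary operation a convolutional layer applies to
    each input channel: slide the kernel over the image and take a
    weighted sum at each position.
    """
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out


# A vertical-edge (Sobel-like) kernel responds strongly where the image
# changes from dark to bright columns.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
sobel_x = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]
print(conv2d(image, sobel_x))  # every window straddles the edge: [[4.0, 4.0], [4.0, 4.0]]
```

In a trained network such kernels are not hand-designed; their weights are learned by backpropagation, which is what distinguishes the deep-learning detectors discussed here from classical hand-crafted feature methods.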
![Fig. 1](http://media.springernature.com/m312/springer-static/image/art%3A10.3103%2FS8756699023030032/MediaObjects/11974_2023_8269_Fig1_HTML.png)
![Fig. 2](http://media.springernature.com/m312/springer-static/image/art%3A10.3103%2FS8756699023030032/MediaObjects/11974_2023_8269_Fig2_HTML.png)
![Fig. 3](http://media.springernature.com/m312/springer-static/image/art%3A10.3103%2FS8756699023030032/MediaObjects/11974_2023_8269_Fig3_HTML.png)
![Fig. 4](http://media.springernature.com/m312/springer-static/image/art%3A10.3103%2FS8756699023030032/MediaObjects/11974_2023_8269_Fig4_HTML.png)
![Fig. 5](http://media.springernature.com/m312/springer-static/image/art%3A10.3103%2FS8756699023030032/MediaObjects/11974_2023_8269_Fig5_HTML.png)
![Fig. 6](http://media.springernature.com/m312/springer-static/image/art%3A10.3103%2FS8756699023030032/MediaObjects/11974_2023_8269_Fig6_HTML.png)
![Fig. 7](http://media.springernature.com/m312/springer-static/image/art%3A10.3103%2FS8756699023030032/MediaObjects/11974_2023_8269_Fig7_HTML.png)
![Fig. 8](http://media.springernature.com/m312/springer-static/image/art%3A10.3103%2FS8756699023030032/MediaObjects/11974_2023_8269_Fig8_HTML.png)
![Fig. 9](http://media.springernature.com/m312/springer-static/image/art%3A10.3103%2FS8756699023030032/MediaObjects/11974_2023_8269_Fig9_HTML.png)
![Fig. 10](http://media.springernature.com/m312/springer-static/image/art%3A10.3103%2FS8756699023030032/MediaObjects/11974_2023_8269_Fig10_HTML.png)
![Fig. 11](http://media.springernature.com/m312/springer-static/image/art%3A10.3103%2FS8756699023030032/MediaObjects/11974_2023_8269_Fig11_HTML.png)
![Fig. 12](http://media.springernature.com/m312/springer-static/image/art%3A10.3103%2FS8756699023030032/MediaObjects/11974_2023_8269_Fig12_HTML.png)
![Fig. 13](http://media.springernature.com/m312/springer-static/image/art%3A10.3103%2FS8756699023030032/MediaObjects/11974_2023_8269_Fig13_HTML.png)
![Fig. 14](http://media.springernature.com/m312/springer-static/image/art%3A10.3103%2FS8756699023030032/MediaObjects/11974_2023_8269_Fig14_HTML.png)
![Fig. 15](http://media.springernature.com/m312/springer-static/image/art%3A10.3103%2FS8756699023030032/MediaObjects/11974_2023_8269_Fig15_HTML.png)
Funding
The research was carried out within the state assignment of Ministry of Science and Higher Education of the Russian Federation (project no. 121022000116-0) at the Institute of Automation and Electrometry of the Siberian Branch of the Russian Academy of Sciences.
Ethics declarations
The authors declare that they have no conflicts of interest.
Additional information
Translated by L. Trubitsyna
About this article
Cite this article
Borzov, S. M. and Nezhevenko, E. S., "Neural network technologies for detection and classification of objects," Optoelectron., Instrum. Data Process. 59, 329–345 (2023). https://doi.org/10.3103/S8756699023030032