Abstract
Artificial intelligence has grown significantly in recent decades. As a result, several architectures have been developed for object detection, classification, and recognition. Several alternatives currently fulfill these purposes; however, there is no rigid framework defining how these architectures are formed. This work presents an updated performance analysis of real-time vehicle detection and counting using You Only Look Once version 8 (YOLOv8), RetinaNet (RN), and Single Shot Detector (SSD). Google Colaboratory was used as the main retraining environment for this analysis. The Research-Action methodology was employed to develop the practical case, and a systematic literature review was conducted to determine the state of the art in this problem domain. For feature extraction, ResNet-50 and MobileNet were used in RN and SSD, respectively. The results indicate that YOLOv8, the architecture that has undergone the most revisions since its inception, performs best in terms of detection time, precision relative to the number of frames that must be analyzed for real-time use, and ease of implementation.
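The abstract compares detectors on detection time and precision under a real-time constraint. The following is a minimal sketch of how such a comparison could be computed. It is not the authors' evaluation code: the `DetectorRun` record, the helper names, the 30 frames-per-second target, and every number in the example are illustrative assumptions and do not come from the chapter.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DetectorRun:
    """Hypothetical record of one detector evaluated on one video."""
    name: str
    true_positives: int             # detections matched to a real vehicle
    false_positives: int            # detections with no matching vehicle
    frame_latencies_s: List[float]  # inference time per frame, in seconds


def precision(run: DetectorRun) -> float:
    """Precision = TP / (TP + FP); 0.0 when there are no detections."""
    detections = run.true_positives + run.false_positives
    return run.true_positives / detections if detections else 0.0


def mean_fps(run: DetectorRun) -> float:
    """Average frames processed per second over the recorded frames."""
    total_time = sum(run.frame_latencies_s)
    return len(run.frame_latencies_s) / total_time if total_time > 0 else 0.0


def meets_real_time(run: DetectorRun, target_fps: float = 30.0) -> bool:
    """True when the detector keeps up with the assumed target frame rate."""
    return mean_fps(run) >= target_fps


if __name__ == "__main__":
    # Placeholder measurements, invented purely for illustration.
    runs = [
        DetectorRun("detector_a", 90, 10, [0.02] * 50),
        DetectorRun("detector_b", 95, 5, [0.05] * 50),
    ]
    for run in runs:
        print(run.name, round(precision(run), 3),
              round(mean_fps(run), 1), meets_real_time(run))
```

Keeping precision and frame rate as separate quantities makes the trade-off explicit: a detector can be more precise and still fail the real-time check, which is the kind of tension the abstract describes.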
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Buitrón, I.A., Yoo, S.G. (2023). Performance Analysis of You Only Look Once, RetinaNet, and Single Shot Detector Applied to Vehicle Detection and Counting. In: Maldonado-Mahauad, J., Herrera-Tapia, J., Zambrano-Martínez, J.L., Berrezueta, S. (eds) Information and Communication Technologies. TICEC 2023. Communications in Computer and Information Science, vol 1885. Springer, Cham. https://doi.org/10.1007/978-3-031-45438-7_17
DOI: https://doi.org/10.1007/978-3-031-45438-7_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-45437-0
Online ISBN: 978-3-031-45438-7
eBook Packages: Computer Science, Computer Science (R0)