Abstract
An integrated method for pedestrian detection, tracking, and intention recognition is proposed to address the frequent conflicts between autonomous vehicles and pedestrians in street-crossing scenarios. First, YOLOv8 is enhanced with the C2f_CA (coordinate attention) module to achieve accurate pedestrian detection, tracking, and pose estimation. Second, a set of intention-recognition features is designed to characterize pedestrian position and pose in both the spatial and temporal domains. Finally, with these features as input, an intention classification model is built on a Stacking ensemble that uses SVM, KNN, and random forest as base learners and XGBoost as the meta-learner. Experimental results show that the enhanced YOLOv8 improves detection accuracy by 5.4% over the original model, and the Stacking-based intention recognition achieves 94.0% accuracy on the JAAD dataset, an improvement of more than 3.4% over existing intention-recognition models. Furthermore, when different parts of a pedestrian are occluded, the accuracy of the Stacking model still reaches 65.8%–73.3%, verifying its robustness. The proposed method provides reliable input for the decision planning of autonomous vehicles, which helps improve the safety of autonomous driving.
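The Stacking architecture described above (SVM, KNN, and random forest as base learners with a boosted-tree meta-learner) can be sketched with scikit-learn. This is a minimal illustration on synthetic data, not the paper's implementation: the feature set, hyperparameters, and data are hypothetical, and scikit-learn's GradientBoostingClassifier stands in for the XGBoost meta-learner to keep the example self-contained.

```python
# Sketch of a heterogeneous Stacking ensemble: three base learners whose
# cross-validated predictions feed a boosted-tree meta-learner.
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Placeholder for the pedestrian pose/position feature vectors and
# binary crossing/not-crossing labels used in the paper.
X, y = make_classification(n_samples=400, n_features=12, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

stack = StackingClassifier(
    estimators=[
        ("svm", SVC(probability=True, random_state=0)),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ],
    # The paper uses XGBoost here; a gradient-boosted tree model plays
    # the same meta-learner role in this sketch.
    final_estimator=GradientBoostingClassifier(random_state=0),
    cv=5,  # base-learner meta-features come from 5-fold cross-validation
)
stack.fit(X_tr, y_tr)
acc = stack.score(X_te, y_te)
```

Cross-validated stacking (the `cv` argument) matters here: training the meta-learner on the base learners' out-of-fold predictions, rather than their in-sample outputs, is what keeps the ensemble from simply memorizing the base learners' training-set behavior.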
Ethics declarations
Conflict of Interest The authors declare that they have no conflict of interest.
Additional information
Foundation item: the National Natural Science Foundation of China (No. 52302501), and the Natural Science Foundation of Shanghai (No. 21ZR1444500)
Cite this article
Lu, J., Chen, H., Bai, Y. et al. Recognition of Pedestrians’ Street-Crossing Intentions Based on Skeleton Features. J. Shanghai Jiaotong Univ. (Sci.) (2024). https://doi.org/10.1007/s12204-024-2700-9