Abstract
Existing temporal segmentation methods suffer from high computational complexity and complicated processing steps. To address these issues, we present a method that combines a binary tree with a spatio-temporal tunnel (STT) for temporal segmentation of rough videos. First, we compute the initial cumulative spatio-temporal flow to determine flow overflow in each sub-video divided from a rough video. Second, a decision tree is generated by combining the binary tree with a balance factor to dynamically adjust the sampling line of the STT. Finally, pixels on the sampling line are extracted to generate an adaptive STT for temporal proposals. Experimental results show that the computational complexity of the proposed method is significantly lower than that of the compared methods while accuracy is maintained.
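The final step, extracting pixels on a sampling line to form an STT, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a fixed horizontal sampling line rather than the adaptive, binary-tree-adjusted line described above, and it uses the sum of absolute differences between consecutive slices as a stand-in for the spatio-temporal flow. The names `build_tunnel`, `slice_flow`, `propose_segments`, and the `threshold` parameter are hypothetical.

```python
from typing import List, Sequence, Tuple

# A grayscale frame as rows of pixel intensities (assumed representation).
Frame = Sequence[Sequence[int]]


def build_tunnel(frames: Sequence[Frame], row: int) -> List[List[int]]:
    """Stack the pixels on one fixed horizontal sampling line, one slice per frame."""
    return [list(frame[row]) for frame in frames]


def slice_flow(tunnel: Sequence[Sequence[int]]) -> List[int]:
    """Sum of absolute differences between consecutive slices.

    This is a hypothetical proxy for spatio-temporal flow, not the paper's definition.
    Entry i compares the slices of frame i and frame i + 1.
    """
    return [
        sum(abs(a - b) for a, b in zip(prev, cur))
        for prev, cur in zip(tunnel, tunnel[1:])
    ]


def propose_segments(flow: Sequence[int], threshold: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) ranges of flow indices whose value exceeds threshold."""
    segments: List[Tuple[int, int]] = []
    start = None
    for i, value in enumerate(flow):
        if value > threshold and start is None:
            start = i  # activity begins
        elif value <= threshold and start is not None:
            segments.append((start, i))  # activity ends
            start = None
    if start is not None:
        segments.append((start, len(flow)))  # activity runs to the last slice
    return segments
```

A caller would build the tunnel once per sampling line, compute the flow proxy, and pass it to `propose_segments` with a threshold chosen for the footage. An adaptive variant in the spirit of the method would choose the sampling row per sub-video instead of fixing it.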
Additional information
Statements and Declarations
The authors declare that there are no conflicts of interest related to this article.
Document code: A
This work was supported by the National Natural Science Foundation of China (Nos. 61702347 and 62027801), the Natural Science Foundation of Hebei Province (Nos. F2022210007 and F2017210161), the Science and Technology Project of Hebei Education Department (Nos. ZD2022100 and QN2017132), and the Central Guidance on Local Science and Technology Development Fund (No. 226Z0501G).
About this article
Cite this article
Zhang, Y., Guo, K. Proposals from binary tree and spatio-temporal tunnel for temporal segmentation of rough videos. Optoelectron. Lett. 18, 763–768 (2022). https://doi.org/10.1007/s11801-022-2103-9
DOI: https://doi.org/10.1007/s11801-022-2103-9