Conjoined triple deep network for video anomaly detection

Chang, Xingya; Wu, Yunhe; Deng, Shizhuo; Jia, Tong; Chen, Dongyue

doi:10.1007/s11042-023-17842-0

Published: 27 December 2023

Multimedia Tools and Applications, volume 83, pages 59491–59518 (2024)

Xingya Chang¹, Yunhe Wu¹, Shizhuo Deng¹, Tong Jia¹ & Dongyue Chen¹

Abstract

The video anomaly detection task typically involves identifying anomalous targets, behaviors, and events in surveillance using only normal samples. Most mainstream anomaly detection models train an encoder-decoder network exclusively with normal samples, identifying frames with larger reconstruction errors as anomalies. The challenge with such methods lies in controlling the generalization ability of the reconstruction model on anomaly samples and the bias of reconstruction maps towards small-scale anomalies. To address these issues, we propose a triple-stream framework for anomaly detection, combining cross-prediction agent tasks and multiple local probabilistic models. We incorporate a dual learning mechanism in both the appearance and motion channels, allowing mutual feedback to make the model overfit to normal samples and correspondingly weaken its generalization on anomalous samples. Additionally, we apply the attention mechanism to the network, design a feature consistency function to constrain bias to local features, and construct a probability model for each local region to detect larger-scale anomalies. Finally, we design a fusion scheme to evaluate anomaly scores for video frames. Evaluations on popular benchmark datasets, including UCSD, Avenue, and Street Scene, demonstrate that our proposed model achieves competitive performance compared to state-of-the-art methods.
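The reconstruction-error idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' triple-stream model or their fusion scheme; it only shows the generic baseline the abstract refers to, in which frames with larger reconstruction error are scored as more anomalous. The function names, the toy data, and the per-video min-max normalisation are assumptions made for illustration.

```python
import numpy as np

def frame_errors(frames, recons):
    """Mean squared reconstruction error per frame; inputs shaped (T, H, W)."""
    frames = np.asarray(frames, dtype=np.float64)
    recons = np.asarray(recons, dtype=np.float64)
    return ((frames - recons) ** 2).mean(axis=(1, 2))

def anomaly_scores(errors, eps=1e-12):
    """Min-max normalise errors within one video; higher means more anomalous."""
    errors = np.asarray(errors, dtype=np.float64)
    span = errors.max() - errors.min()
    return (errors - errors.min()) / (span + eps)

# Toy data (hypothetical): five frames, four reconstructed well and one badly.
rng = np.random.default_rng(0)
frames = rng.random((5, 8, 8))
recons = frames + 0.01 * rng.standard_normal((5, 8, 8))
recons[3] += 0.5  # simulate a poorly reconstructed, i.e. anomalous, frame

scores = anomaly_scores(frame_errors(frames, recons))
flagged = int(np.argmax(scores))  # index of the highest-scoring frame
```

In this toy setup the badly reconstructed frame receives the highest score. A real system would replace `recons` with the output of a trained encoder-decoder and would typically threshold or rank the scores per video.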

Data Availability

The datasets analysed during the current study are available at http://101.32.75.151:8181/dataset/.

Code Availability

The code that supports the findings of this study is available at https://github.com/changxingya/code.git.

Funding

This work is supported by the National Natural Science Foundation of China (NSFC) (62202087).

Author information

Authors and Affiliations

College of Information Science and Engineering, Northeastern University, No.11 Heping Region Wenhua Street, Shenyang, 110819, China
Xingya Chang, Yunhe Wu, Shizhuo Deng, Tong Jia & Dongyue Chen

Contributions

All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Xingya Chang, Yunhe Wu, Shizhuo Deng, Tong Jia and Dongyue Chen. The first draft of the manuscript was written by Xingya Chang and Yunhe Wu, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Dongyue Chen.

Ethics declarations

Conflicts of interest

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Chang, X., Wu, Y., Deng, S. et al. Conjoined triple deep network for video anomaly detection. Multimed Tools Appl 83, 59491–59518 (2024). https://doi.org/10.1007/s11042-023-17842-0

Received: 15 October 2023
Revised: 22 November 2023
Accepted: 07 December 2023
Published: 27 December 2023
Issue Date: June 2024
DOI: https://doi.org/10.1007/s11042-023-17842-0
