Single-target localization in video sequences using offline deep-ranked metric learning and online learned models updating

Multimedia Tools and Applications

Abstract

Localizing foreground targets in video sequences has attracted considerable attention in computer vision over the past few years, and the problem is closely tied to machine learning techniques. Driven by the recent rise of deep learning, many contemporary localization methods adopt deep models and have benefited substantially from their strong generalization capability. In this study, inspired by deep metric learning, a recent trend in deep learning, a novel single-target localization method is proposed. The method consists of two steps. First, an offline deep-ranked metric learning step is carried out, and the gradient for end-to-end learning of the whole deep model is derived so that the conventional stochastic gradient algorithm can be applied. An alternative proximal gradient algorithm is also introduced to improve efficiency. Second, an online model updating step is applied in both a consecutive and an incremental manner, so that the offline-learned model adapts as the video sequence progresses, where challenging circumstances such as sudden illumination changes, obstacles, shape transformation, and complex backgrounds are likely to occur. The method is compared with several shallow learning-based and deep learning-based localization methods on a large video database. Both qualitative and quantitative analyses are conducted to demonstrate, from a statistical point of view, the superiority of the new single-target localization method.
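
The abstract does not give the loss function, the network, or the proximal operator, so the following is only a minimal sketch of the general idea of ranked metric learning with a stochastic proximal gradient update, under loudly stated assumptions: the deep model is replaced by a diagonal metric `w`, the ranking objective is assumed to be a triplet hinge loss, and the regularizer is assumed to be an L1 penalty with a non-negativity constraint. All function names and parameters are hypothetical and are not taken from the article.

```python
import random


def sq_dist(w, a, b):
    """Weighted squared distance d_w(a, b) = sum_i w_i * (a_i - b_i)^2."""
    return sum(wi * (ai - bi) ** 2 for wi, ai, bi in zip(w, a, b))


def triplet_loss_and_grad(w, anchor, pos, neg, margin=1.0):
    """Assumed ranking loss max(0, margin + d(a, p) - d(a, n)) and a subgradient in w."""
    loss = margin + sq_dist(w, anchor, pos) - sq_dist(w, anchor, neg)
    if loss <= 0.0:
        # The triplet is already ranked correctly with the required margin.
        return 0.0, [0.0] * len(w)
    grad = [(a - p) ** 2 - (a - n) ** 2 for a, p, n in zip(anchor, pos, neg)]
    return loss, grad


def prox_step(w, grad, lr, lam):
    """One proximal gradient step for the assumed penalty lam * ||w||_1 with w >= 0.

    A gradient step on the loss is followed by the proximal map of the
    penalty, which here reduces to a shifted clamp at zero.
    """
    return [max(0.0, wi - lr * gi - lr * lam) for wi, gi in zip(w, grad)]


def train(triplets, dim, lr=0.05, lam=0.01, epochs=50, seed=0):
    """Stochastic proximal gradient over (anchor, positive, negative) triplets."""
    rng = random.Random(seed)
    triplets = list(triplets)
    w = [1.0] * dim  # start from the plain Euclidean metric
    for _ in range(epochs):
        rng.shuffle(triplets)
        for anchor, pos, neg in triplets:
            _, grad = triplet_loss_and_grad(w, anchor, pos, neg)
            w = prox_step(w, grad, lr, lam)
    return w
```

In this toy setting, calling `train` on triplets whose positives differ from the anchor mainly along a noisy feature drives that feature's weight toward zero while keeping weight on the discriminative feature. An online update in the spirit of the second step could, as a further assumption, reuse `prox_step` on triplets collected from newly arrived frames.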

Acknowledgements

The authors would like to acknowledge grants 61403182, 61363046, 61301194, and 61231016 from the National Natural Science Foundation of China; the key youth grant 20171ACB21017 from the Natural Science Foundation of Jiangxi Province; the Natural Science Foundation grant 827/000088 of SZU; grant 20126102120055 from the Research Fund for the Doctoral Program of Higher Education of China; and the foundation grant 3102014JSJ0014 from NWPU, all of which supported this study.

Author information

Corresponding author

Correspondence to Huijun Ding.

About this article

Cite this article

Huang, W., Zeng, J., Zhang, P. et al. Single-target localization in video sequences using offline deep-ranked metric learning and online learned models updating. Multimed Tools Appl 77, 28539–28565 (2018). https://doi.org/10.1007/s11042-018-6042-1

  • DOI: https://doi.org/10.1007/s11042-018-6042-1
