Lightweight Object Tracking Algorithm Based on Siamese Network with Efficient Attention

  • Conference paper
Data Mining and Big Data (DMBD 2021)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1453)

Abstract

Siamese networks have drawn great attention in visual tracking in recent years because they offer a good balance between accuracy and speed. However, the backbone networks that most Siamese trackers use as feature extractors are relatively shallow and narrow, like AlexNet, and so do not take full advantage of deep neural networks. In this paper, we propose a lightweight Siamese network object tracking algorithm based on an efficient attention mechanism to enhance tracking robustness and accuracy. First, we modify MobileNetV2 and use it as our backbone network, which drastically reduces the number of parameters and the amount of computation and speeds up training and testing. Second, an attention mechanism weights the feature maps along the channel and spatial dimensions to distribute the contributions of the different response maps. Third, features from different levels are fused to obtain more robust results. Experiments show that our tracker improves both accuracy and speed on three benchmarks: OTB2015, VOT2018 and TrackingNet.
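
To make the two core operations named in the abstract concrete, the following is a minimal pure-Python sketch, not the authors' implementation. The abstract does not specify the attention module, so the sketch assumes an ECA-style channel attention (global average pooling, a 1D convolution across neighbouring channels, then a sigmoid) with fixed, hypothetical kernel weights in place of learned ones. It also assumes the plain sliding-window cross-correlation commonly used in Siamese trackers. Spatial attention, the MobileNetV2 backbone and multi-level fusion are omitted, and all function names are illustrative.

```python
import math

def channel_attention(features, kernel=(0.25, 0.5, 0.25)):
    """Return one weight in (0, 1) per channel (ECA-style, assumed design).

    features: list of C channels, each an H x W list of floats.
    kernel: hypothetical fixed 1D weights; a real module would learn them.
    """
    # Global average pooling: one scalar descriptor per channel.
    pooled = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in features]
    pad = len(kernel) // 2
    padded = [0.0] * pad + pooled + [0.0] * pad
    weights = []
    for c in range(len(pooled)):
        # 1D convolution across neighbouring channels, then a sigmoid.
        z = sum(k * padded[c + i] for i, k in enumerate(kernel))
        weights.append(1.0 / (1.0 + math.exp(-z)))
    return weights

def reweight(features, weights):
    """Scale every value of each channel by that channel's weight."""
    return [[[w * v for v in row] for row in ch]
            for ch, w in zip(features, weights)]

def cross_correlate(template, search):
    """Slide the template over the search features and sum over channels."""
    th, tw = len(template[0]), len(template[0][0])
    sh, sw = len(search[0]), len(search[0][0])
    response = []
    for y in range(sh - th + 1):
        row = []
        for x in range(sw - tw + 1):
            score = 0.0
            for t_ch, s_ch in zip(template, search):
                for dy in range(th):
                    for dx in range(tw):
                        score += t_ch[dy][dx] * s_ch[y + dy][x + dx]
            row.append(score)
        response.append(row)
    return response

def locate(response):
    """Return the (row, col) of the highest response."""
    best = max((v, y, x) for y, row in enumerate(response)
               for x, v in enumerate(row))
    return best[1], best[2]

# Toy data: a 2-channel 2x2 template embedded at row 1, col 2 of a
# 2-channel 4x4 search region that is zero everywhere else.
template = [[[1.0, 2.0], [3.0, 4.0]], [[4.0, 3.0], [2.0, 1.0]]]
search = [[[0.0] * 4 for _ in range(4)] for _ in range(2)]
for c in range(2):
    for dy in range(2):
        for dx in range(2):
            search[c][1 + dy][2 + dx] = template[c][dy][dx]

# Each branch is reweighted by its own channel attention before matching.
t_weights = channel_attention(template)
s_weights = channel_attention(search)
response = cross_correlate(reweight(template, t_weights),
                           reweight(search, s_weights))
peak = locate(response)
```

In this toy case `peak` is the offset at which the template was embedded. Because the attention weights are positive, they rescale each channel's contribution to the response map without moving the peak here.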

Author information

Corresponding author

Correspondence to Qingling Liu.

Copyright information

© 2021 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Yu, H., Liu, Q. (2021). Lightweight Object Tracking Algorithm Based on Siamese Network with Efficient Attention. In: Tan, Y., Shi, Y., Zomaya, A., Yan, H., Cai, J. (eds) Data Mining and Big Data. DMBD 2021. Communications in Computer and Information Science, vol 1453. Springer, Singapore. https://doi.org/10.1007/978-981-16-7476-1_41

  • DOI: https://doi.org/10.1007/978-981-16-7476-1_41

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-16-7475-4

  • Online ISBN: 978-981-16-7476-1

  • eBook Packages: Computer Science, Computer Science (R0)
