
CAMTrack: a combined appearance-motion method for multiple-object tracking

  • Research
  • Published:
Machine Vision and Applications

Abstract

Object tracking has become an essential component of many computer vision applications, such as autonomous driving, and the technology has advanced rapidly in recent years, particularly in self-driving vehicles. Tracking systems typically follow the detection-based tracking paradigm, so their performance depends on the quality of the detection results. Although deep learning has led to significant improvements in object detection, data association still relies on cues such as spatial location, motion, and appearance to link new observations with existing tracks. In this study, we introduce Combined Appearance-Motion Tracking (CAMTrack), a novel approach that enhances data association by integrating object appearance with the corresponding motion. The proposed tracker builds an appearance-motion model from an appearance-affinity network and an Interactive Multiple Model (IMM): the appearance model captures the visual affinity between objects across frames, while the motion model imposes motion constraints to obtain robust position predictions under maneuvering movements. We also propose a two-phase association algorithm that effectively recovers lost tracks from previous frames. CAMTrack was evaluated on the widely recognized object tracking benchmarks KITTI and MOT17. The results show the superior performance of the proposed method, highlighting its potential to contribute to advances in object tracking.
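The abstract outlines the association pipeline at a high level: an appearance-affinity network scores visual similarity between objects across frames, an IMM-based motion model predicts where each track should appear, and a two-phase association step first matches detections to active tracks and then attempts to recover recently lost tracks from the remaining detections. The Python sketch below illustrates only that general pattern under stated assumptions; the cosine/IoU cost, the weight w_app, the thresholds, and all function names are illustrative and are not taken from the paper.

    # Hypothetical sketch of a two-phase, appearance+motion association step.
    # Data layout, weights, and thresholds are assumptions for illustration.
    import numpy as np
    from scipy.optimize import linear_sum_assignment

    def iou(box_a, box_b):
        # Intersection-over-union of two [x1, y1, x2, y2] boxes.
        x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
        x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
        inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
        area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
        area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
        return inter / (area_a + area_b - inter + 1e-9)

    def combined_cost(tracks, detections, w_app=0.5):
        # Cost mixes appearance distance (1 - cosine similarity of embeddings,
        # assumed L2-normalised) with motion distance (1 - IoU between the
        # motion-predicted track box and the detection box).
        cost = np.zeros((len(tracks), len(detections)))
        for i, trk in enumerate(tracks):
            for j, det in enumerate(detections):
                app_sim = float(np.dot(trk["embedding"], det["embedding"]))
                motion_sim = iou(trk["predicted_box"], det["box"])
                cost[i, j] = w_app * (1.0 - app_sim) + (1.0 - w_app) * (1.0 - motion_sim)
        return cost

    def associate(tracks, detections, max_cost=0.7):
        # One assignment pass; returns matched index pairs plus unmatched
        # track and detection indices.
        if not tracks or not detections:
            return [], list(range(len(tracks))), list(range(len(detections)))
        cost = combined_cost(tracks, detections)
        rows, cols = linear_sum_assignment(cost)
        matches = [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= max_cost]
        matched_t = {r for r, _ in matches}
        matched_d = {c for _, c in matches}
        unmatched_t = [i for i in range(len(tracks)) if i not in matched_t]
        unmatched_d = [j for j in range(len(detections)) if j not in matched_d]
        return matches, unmatched_t, unmatched_d

    def two_phase_association(active_tracks, lost_tracks, detections):
        # Phase 1: match detections against active tracks.
        matches_1, unmatched_active, unmatched_det = associate(active_tracks, detections)
        # Phase 2: try to recover recently lost tracks using the leftover
        # detections (indices in matches_2 refer to the leftover list).
        leftover = [detections[j] for j in unmatched_det]
        matches_2, unmatched_lost, still_unmatched = associate(lost_tracks, leftover)
        # Recovered lost tracks are re-activated; detections still unmatched
        # after both phases would typically start new tracks.
        return matches_1, matches_2, unmatched_active, unmatched_lost, still_unmatched

In this sketch, SciPy's linear_sum_assignment stands in for the Hungarian algorithm commonly used for the assignment step in detection-based tracking systems; the IMM filter itself is abstracted away behind the predicted_box field.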


Acknowledgements

This research was supported by the National Research Foundation of Korea (NRF) grant funded by the Government of South Korea (MSIT) (NRF-2021R1A2B5B01002559).

Author information

Corresponding author

Correspondence to Myungsik Yoo.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Bui, D.C., Nguyen, N.L., Hoang, A.H. et al. CAMTrack: a combined appearance-motion method for multiple-object tracking. Machine Vision and Applications 35, 62 (2024). https://doi.org/10.1007/s00138-024-01548-w

