A Speaker Tracking Algorithm Based on Audio and Visual Information Fusion Using Particle Filter

  • Conference paper
Image Analysis and Recognition (ICIAR 2004)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3212)


Abstract

Object tracking by sensor fusion has become an active research area in recent years, but how to fuse information from different sources efficiently and robustly remains an open problem. This paper presents a new algorithm for tracking a speaker by fusing audio and visual information with a particle filter. A closed-loop architecture that accounts for the reliability of each individual tracker is adopted, and a new method for data fusion and reliability adjustment is proposed. Experiments show that the new algorithm fuses information efficiently and is robust to noise.
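The abstract does not spell out the fusion or reliability-adjustment rules, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a 1-D speaker position, a random-walk motion model, Gaussian audio and visual likelihoods, a reliability-weighted product of those likelihoods, and an exponential-smoothing rule that feeds the fused estimate back into the reliabilities. All names (`fuse_step`, `run_demo`) and all parameter values are hypothetical.

```python
import math
import random


def gaussian(x, mean, sigma):
    """Unnormalised Gaussian likelihood of x around mean."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)


def fuse_step(particles, z_audio, z_visual, rel, rng,
              sigma_audio=0.5, sigma_visual=0.2,
              motion_sigma=0.1, rate=0.2):
    """One predict/weight/resample step with reliability feedback.

    All modelling choices here are assumptions made for illustration.
    """
    # 1. Predict: diffuse particles with a random-walk motion model.
    predicted = [p + rng.gauss(0.0, motion_sigma) for p in particles]

    # 2. Weight: each sensor's likelihood is raised to its reliability,
    #    so a less reliable sensor influences the weights less.
    weights = [
        gaussian(z_audio, p, sigma_audio) ** rel["audio"]
        * gaussian(z_visual, p, sigma_visual) ** rel["visual"]
        for p in predicted
    ]
    total = sum(weights)
    if total == 0.0:
        # Degenerate case: fall back to uniform weights.
        weights = [1.0 / len(predicted)] * len(predicted)
    else:
        weights = [w / total for w in weights]

    # 3. Fused estimate: weighted mean of the particle set.
    estimate = sum(w * p for w, p in zip(weights, predicted))

    # 4. Resample particles in proportion to their weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))

    # 5. Closed loop: move each reliability toward the share of
    #    agreement between that sensor's measurement and the estimate.
    agree = {
        "audio": gaussian(z_audio, estimate, sigma_audio),
        "visual": gaussian(z_visual, estimate, sigma_visual),
    }
    norm = agree["audio"] + agree["visual"]
    new_rel = dict(rel)
    if norm > 0.0:
        for name in rel:
            new_rel[name] = (1 - rate) * rel[name] + rate * agree[name] / norm
    return resampled, estimate, new_rel


def run_demo(steps=50, n_particles=500, seed=0):
    """Track a stationary synthetic speaker at position 1.0."""
    rng = random.Random(seed)
    true_pos = 1.0
    particles = [rng.uniform(-3.0, 3.0) for _ in range(n_particles)]
    rel = {"audio": 0.5, "visual": 0.5}
    estimate = 0.0
    for _ in range(steps):
        # Synthetic measurements: audio is noisier than vision here.
        z_audio = true_pos + rng.gauss(0.0, 0.5)
        z_visual = true_pos + rng.gauss(0.0, 0.2)
        particles, estimate, rel = fuse_step(
            particles, z_audio, z_visual, rel, rng)
    return estimate, rel
```

Calling `run_demo()` returns the final fused position estimate and the two reliabilities. Because the update in step 5 is a convex combination, the reliabilities keep summing to one as long as they start that way. The rules actually proposed in the paper may differ from this sketch.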




Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Li, X., Sun, L., Tao, L., Xu, G., Jia, Y. (2004). A Speaker Tracking Algorithm Based on Audio and Visual Information Fusion Using Particle Filter. In: Campilho, A., Kamel, M. (eds) Image Analysis and Recognition. ICIAR 2004. Lecture Notes in Computer Science, vol 3212. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30126-4_70


  • DOI: https://doi.org/10.1007/978-3-540-30126-4_70

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23240-7

  • Online ISBN: 978-3-540-30126-4

  • eBook Packages: Springer Book Archive
