Abstract
Object tracking by sensor fusion has become an active research area in recent years, but how to fuse heterogeneous information efficiently and robustly remains an open problem. This paper presents a new algorithm for tracking a speaker by fusing audio and visual information with a particle filter. A closed-loop architecture that accounts for the reliability of each individual tracker is adopted, and a new method for data fusion and reliability adjustment is proposed. Experiments show that the new algorithm fuses information efficiently and is robust to noise.
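The abstract does not give the fusion or reliability-adjustment rules, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes a 1-D speaker position, a random-walk motion model, Gaussian audio and visual likelihoods, reliabilities used as exponents on each cue's likelihood, and an exponential-smoothing reliability update; the names `fused_step`, `track`, `rel`, and all parameter values are hypothetical.

```python
import math
import random


def log_lik(z, x, sigma):
    """Unnormalised Gaussian log-likelihood of measurement z given state x."""
    return -0.5 * ((z - x) / sigma) ** 2


def fused_step(particles, z_audio, z_visual, rel, rng,
               motion_std=0.5, sigma_audio=1.0, sigma_visual=0.5, rate=0.2):
    """One predict / fuse / adjust / resample cycle (illustrative only)."""
    # Predict with a random-walk motion model (an assumption of this sketch).
    particles = [p + rng.gauss(0.0, motion_std) for p in particles]

    # Fuse: each cue's log-likelihood is scaled by its current reliability,
    # i.e. the likelihoods are multiplied after being raised to rel[cue].
    log_w = [rel["audio"] * log_lik(z_audio, p, sigma_audio)
             + rel["visual"] * log_lik(z_visual, p, sigma_visual)
             for p in particles]
    peak = max(log_w)  # subtract the peak to avoid underflow
    weights = [math.exp(lw - peak) for lw in log_w]
    total = sum(weights)
    weights = [w / total for w in weights]
    estimate = sum(p * w for p, w in zip(particles, weights))

    # Closed loop (assumed rule): a cue whose measurement agrees with the
    # fused estimate gains reliability; one that disagrees loses it.
    new_rel = {}
    for name, z, sigma in (("audio", z_audio, sigma_audio),
                           ("visual", z_visual, sigma_visual)):
        agreement = math.exp(log_lik(z, estimate, sigma))  # lies in [0, 1]
        new_rel[name] = (1.0 - rate) * rel[name] + rate * agreement

    # Resample in proportion to the fused weights.
    particles = rng.choices(particles, weights=weights, k=len(particles))
    return particles, estimate, new_rel


def track(measurements, n_particles=500, seed=0):
    """Run the filter over a list of (z_audio, z_visual) pairs."""
    rng = random.Random(seed)
    particles = [rng.uniform(0.0, 10.0) for _ in range(n_particles)]
    rel = {"audio": 1.0, "visual": 1.0}
    estimates = []
    for z_audio, z_visual in measurements:
        particles, est, rel = fused_step(particles, z_audio, z_visual, rel, rng)
        estimates.append(est)
    return estimates, rel
```

Calling `track([(5.0, 5.0)] * 20)` runs twenty steps with both cues reporting position 5.0 and returns the per-step estimates together with the final reliabilities. This simple agreement rule does not by itself separate a persistently wrong cue from a correct one; it only illustrates how a reliability term can feed back into the weighting.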
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, X., Sun, L., Tao, L., Xu, G., Jia, Y. (2004). A Speaker Tracking Algorithm Based on Audio and Visual Information Fusion Using Particle Filter. In: Campilho, A., Kamel, M. (eds) Image Analysis and Recognition. ICIAR 2004. Lecture Notes in Computer Science, vol 3212. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30126-4_70
DOI: https://doi.org/10.1007/978-3-540-30126-4_70
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23240-7
Online ISBN: 978-3-540-30126-4
eBook Packages: Springer Book Archive