Multimodal identification and localization of users in a smart environment

Salah, Albert Ali; Morros, Ramon; Luque, Jordi; Segura, Carlos; Hernando, Javier; Ambekar, Onkar; Schouten, Ben; Pauwels, Eric

doi:10.1007/s12193-008-0008-y

Original Paper
Published: 29 May 2008

Volume 2, pages 75–91, (2008)

Journal on Multimodal User Interfaces

Albert Ali Salah¹,
Ramon Morros²,
Jordi Luque²,
Carlos Segura²,
Javier Hernando²,
Onkar Ambekar¹,
Ben Schouten¹ &
Eric Pauwels¹

Abstract

Detecting the location and identity of users is a first step in creating context-aware applications for technologically endowed environments. We propose a system that makes use of motion detection, person tracking, face identification, feature-based identification, audio-based localization, and audio-based identification modules, fusing information with particle filters to obtain robust localization and identification. The data streams are processed with the help of the generic client-server middleware SmartFlow, resulting in a flexible architecture that runs across different platforms.
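
The fusion step mentioned in the abstract can be illustrated with a minimal sketch of a bootstrap particle filter that combines a video-based and an audio-based position estimate. This is not the authors' implementation, whose details are not given on this page. The random-walk motion model, the Gaussian observation models, the noise values, the assumption that the two modalities are conditionally independent, and all function names below are illustrative assumptions.

```python
import math
import random


def gaussian_likelihood(observed, predicted, sigma):
    """Unnormalised isotropic Gaussian likelihood of a 2D observation."""
    dx = observed[0] - predicted[0]
    dy = observed[1] - predicted[1]
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))


def particle_filter_step(particles, video_obs, audio_obs, rng,
                         motion_sigma=0.1, video_sigma=0.2, audio_sigma=0.5):
    """One predict/update/resample cycle (all sigmas are assumed values).

    audio_obs may be None, e.g. when nobody is speaking.
    """
    # Predict: assumed random-walk motion model.
    moved = [(x + rng.gauss(0.0, motion_sigma), y + rng.gauss(0.0, motion_sigma))
             for x, y in particles]

    # Update: multiply the likelihoods of whichever modalities are available
    # (this treats the modalities as conditionally independent).
    weights = []
    for p in moved:
        w = gaussian_likelihood(video_obs, p, video_sigma)
        if audio_obs is not None:
            w *= gaussian_likelihood(audio_obs, p, audio_sigma)
        weights.append(w)

    # Fall back to uniform weights if every likelihood underflowed to zero.
    if sum(weights) == 0.0:
        weights = [1.0] * len(moved)

    # Resample: draw particles with probability proportional to their weight.
    return rng.choices(moved, weights=weights, k=len(moved))


def estimate(particles):
    """Mean position of the particle set."""
    n = len(particles)
    return (sum(x for x, _ in particles) / n, sum(y for _, y in particles) / n)


def run_demo(seed=0, steps=30, n_particles=500):
    """Track a stationary person at (2, 3) in a hypothetical 5 m x 5 m room."""
    rng = random.Random(seed)
    true_pos = (2.0, 3.0)
    particles = [(rng.uniform(0.0, 5.0), rng.uniform(0.0, 5.0))
                 for _ in range(n_particles)]
    for t in range(steps):
        video = (true_pos[0] + rng.gauss(0.0, 0.2),
                 true_pos[1] + rng.gauss(0.0, 0.2))
        # Audio is only available on every third step in this toy scenario.
        audio = None
        if t % 3 == 0:
            audio = (true_pos[0] + rng.gauss(0.0, 0.5),
                     true_pos[1] + rng.gauss(0.0, 0.5))
        particles = particle_filter_step(particles, video, audio, rng)
    return estimate(particles)


if __name__ == "__main__":
    print(run_demo())
```

Multiplying per-modality likelihoods lets a modality drop out (here, silence) without changing the rest of the filter, which is one common reason particle filters are used for audio-visual fusion.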

Author information

Authors and Affiliations

Signals and Images, Centre for Mathematics and Computer Science, Amsterdam, The Netherlands
Albert Ali Salah, Onkar Ambekar, Ben Schouten & Eric Pauwels
Technical University of Catalonia, Barcelona, Spain
Ramon Morros, Jordi Luque, Carlos Segura & Javier Hernando

Corresponding author

Correspondence to Albert Ali Salah.

Additional information

This work is supported by Spanish projects SAPIRE (TEC2007-65470) and PROVEC (TEC2007-66858/TCM) and Dutch projects BRICKS/BSIK and BASIS IOP GenCom.

About this article

Cite this article

Salah, A.A., Morros, R., Luque, J. et al. Multimodal identification and localization of users in a smart environment. J Multimodal User Interfaces 2, 75–91 (2008). https://doi.org/10.1007/s12193-008-0008-y

Received: 26 December 2007
Accepted: 30 April 2008
Published: 29 May 2008
Issue Date: September 2008
DOI: https://doi.org/10.1007/s12193-008-0008-y
