Fast training for object recognition with structure-from-motion

  • Image Processing, Analysis, Recognition, and Understanding
Pattern Recognition and Image Analysis

Abstract

In this paper we present a system for statistical object classification and localization that simplifies image acquisition for the learning phase. Instead of using complex setups to take training images in known poses, which is very time-consuming and impossible for some objects, we use a handheld camera. The object pose parameters in all training frames, which are needed to create the object models, are determined with a structure-from-motion algorithm. The local feature vectors are derived from wavelet multiresolution analysis. We model the object area as a function of 3D transformations and introduce a background model. Experiments on a real data set of more than 2500 images taken with a handheld camera show that good classification and localization rates can be obtained with this fast image acquisition method.
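
To make the idea of wavelet-derived local features concrete, the following is a minimal sketch only; the abstract does not specify the wavelet, the block size, or how coefficients are combined. It assumes a one-level Haar analysis on non-overlapping 2x2 blocks of a grayscale image and a hypothetical two-component feature per block (the low-pass coefficient and the summed magnitude of the three detail coefficients). The function names `haar_2x2` and `local_features` are illustrative and do not come from the paper.

```python
def haar_2x2(a, b, c, d):
    """One-level 2D Haar analysis of the block [[a, b], [c, d]].

    Uses orthonormal scaling (division by 2). Returns the low-pass
    coefficient and three detail coefficients.
    """
    low = (a + b + c + d) / 2.0
    d_cols = (a - b + c - d) / 2.0   # difference between left and right columns
    d_rows = (a + b - c - d) / 2.0   # difference between top and bottom rows
    d_diag = (a - b - c + d) / 2.0   # diagonal difference
    return low, d_cols, d_rows, d_diag


def local_features(image):
    """Compute one (low, detail_magnitude) pair per non-overlapping 2x2 block.

    `image` is a list of equal-length rows of gray values. The choice of
    feature components is an assumption made for illustration.
    """
    rows, cols = len(image), len(image[0])
    features = []
    for r in range(0, rows - 1, 2):
        row_feats = []
        for c in range(0, cols - 1, 2):
            low, dc, dr, dd = haar_2x2(
                image[r][c], image[r][c + 1],
                image[r + 1][c], image[r + 1][c + 1],
            )
            row_feats.append((low, abs(dc) + abs(dr) + abs(dd)))
        features.append(row_feats)
    return features


if __name__ == "__main__":
    # A flat block followed by a block with a vertical edge.
    print(local_features([[1, 1, 2, 4], [1, 1, 2, 4]]))
```

A flat block yields a zero detail magnitude, while a block containing an edge yields a nonzero one, which is the property that makes such coefficients usable as local appearance features.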

Additional information

The text was submitted by the authors in English.

Marcin Grzegorzek, born in 1977, obtained his Master’s Degree in Engineering from the Silesian University of Technology Gliwice (Poland) in 2002. Since December 2002 he has been a PhD candidate and member of the research staff of the Chair for Pattern Recognition at the University of Erlangen-Nuremberg, Germany. His fields are 3D object recognition, statistical modeling, and computer vision. He is an author or coauthor of seven publications.

Michael Reinhold, born in 1969, obtained his degree in Electrical Engineering from RWTH Aachen University, Germany, in 1998. Later, he received a Doctor of Engineering from the University of Erlangen-Nuremberg, Germany, in 2003. His research interests are statistical modeling, object recognition, and computer vision. He is currently a development engineer at Rohde & Schwarz in Munich, Germany, where he works in the Center of Competence for Digital Signal Processing. He is an author or coauthor of 11 publications.

Ingo Scholz, born in 1975, graduated in computer science from the University of Erlangen-Nuremberg, Germany, in 2000 with a degree in Engineering. Since 2001 he has been working as a research staff member at the Institute for Pattern Recognition of the University of Erlangen-Nuremberg. His main research focuses on the reconstruction of light field models, camera calibration techniques, and structure from motion. He is an author or coauthor of ten publications and a member of the German Gesellschaft für Informatik (GI).

Heinrich Niemann obtained his Electrical Engineering degree and Doctor of Engineering degree from Hannover Technical University, Germany. He worked with the Fraunhofer Institut für Informationsverarbeitung in Technik und Biologie, Karlsruhe, and with the Fachhochschule Giessen in the Department of Electrical Engineering. Since 1975 he has been a professor of computer science at the University of Erlangen-Nuremberg, where he was dean of the engineering faculty of the university from 1979 to 1981. From 1988 to 2000, he was head of the Knowledge Processing research group at the Bavarian Research Institute for Knowledge-Based Systems (FORWISS). Since 1998, he has been a spokesman for a “special research area” with the name of “Model-Based Analysis and Visualization of Complex Scenes and Sensor Data” funded by the German Research Foundation. His fields of research are speech and image understanding and the application of artificial intelligence techniques in these areas. He is on the editorial board of Signal Processing, Pattern Recognition Letters, Pattern Recognition and Image Analysis, and the Journal of Computing and Information Technology. He is an author or coauthor of seven books and about 400 journal and conference contributions, as well as editor or coeditor of 24 proceedings volumes and special issues. He is a member of DAGM, ISCA, EURASIP, GI, IEEE, and VDE and an IAPR fellow.

About this article

Cite this article

Grzegorzek, M., Scholz, I., Reinhold, M. et al. Fast training for object recognition with structure-from-motion. Pattern Recognit. Image Anal. 17, 87–92 (2007). https://doi.org/10.1134/S1054661807010105

  • DOI: https://doi.org/10.1134/S1054661807010105