Abstract
Cognitive robots must understand their surroundings not only geometrically: they also need to categorize surfaces, detect objects, and estimate object poses. RGB-D sensors are ideally suited to many of these problems, which is why we developed efficient RGB-D methods to address them. In this chapter, we outline the continuous development and use of RGB-D methods across three applications: our cognitive service robot Cosero, which competed with great success in the international RoboCup@Home competitions; an industrial kitting application; and cluttered bin picking for warehouse automation. We learn semantic segmentation using convolutional neural networks and random forests, and aggregate the surface categories in 3D by RGB-D SLAM. We use deep learning methods to categorize surfaces, to recognize objects, and to estimate their pose. Efficient RGB-D registration methods are the basis for the manipulation of known objects; their extension to non-rigid registration allows manipulation skills to be transferred to novel objects.
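The abstract's aggregation step — fusing per-frame semantic segmentation outputs into a consistent 3D map using camera poses from SLAM — can be illustrated by a simple Bayesian label fusion over voxels. The sketch below is not the chapter's implementation; the function names are hypothetical, and the voxel indices are assumed to come from projecting depth pixels into the map frame via the SLAM pose.

```python
import numpy as np

def fuse_semantic_labels(voxel_logp, voxel_ids, pixel_probs):
    """Accumulate per-pixel class probabilities into per-voxel log-likelihoods.

    voxel_logp:  (V, C) running log-likelihood per voxel and class
    voxel_ids:   (N,) voxel index hit by each observed pixel
                 (assumed precomputed from SLAM pose + depth)
    pixel_probs: (N, C) softmax output of the segmentation network
    """
    # Bayesian fusion under an independence assumption: multiply
    # per-frame likelihoods, i.e. sum their logs per voxel.
    # np.add.at performs an unbuffered scatter-add, so repeated
    # voxel indices within one frame accumulate correctly.
    np.add.at(voxel_logp, voxel_ids, np.log(pixel_probs + 1e-9))
    return voxel_logp

def voxel_classes(voxel_logp):
    """Most likely semantic class per voxel."""
    return np.argmax(voxel_logp, axis=1)

# Toy example: 2 voxels, 3 classes, 3 pixel observations.
fused = fuse_semantic_labels(
    np.zeros((2, 3)),
    np.array([0, 0, 1]),
    np.array([[0.7, 0.2, 0.1],
              [0.6, 0.3, 0.1],
              [0.1, 0.1, 0.8]]),
)
print(voxel_classes(fused))  # voxel 0 -> class 0, voxel 1 -> class 2
```

In practice, per-voxel beliefs would also be renormalized or capped to stay robust against long observation sequences; this sketch keeps only the core scatter-add fusion.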
Acknowledgements
The authors thank the numerous people involved in development and operation of the mentioned robotic systems: Nikita Araslanov, Ishrat Badami, David Droeschel, Germán Martín García, Kathrin Gräve, Dirk Holz, Jochen Kläß, Christian Lenz, Manus McElhone, Anton Milan, Aura Munoz, Matthias Nieuwenhuisen, Arul Selvam Periyasamy, Michael Schreiber, Sebastian Schüller, David Schwarz, Ricarda Steffens, Jörg Stückler, and Angeliki Topalidou-Kyniazopoulou.
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Schwarz, M., Behnke, S. (2019). Semantic RGB-D Perception for Cognitive Service Robots. In: Rosin, P., Lai, YK., Shao, L., Liu, Y. (eds) RGB-D Image Analysis and Processing. Advances in Computer Vision and Pattern Recognition. Springer, Cham. https://doi.org/10.1007/978-3-030-28603-3_13
DOI: https://doi.org/10.1007/978-3-030-28603-3_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-28602-6
Online ISBN: 978-3-030-28603-3
eBook Packages: Computer Science (R0)