Abstract
Adapting to uncertain environments is a key obstacle in the development of robust robotic object manipulation systems: there is a trade-off between the computationally expensive methods needed to handle the surrounding complexity and the real-time requirement for practical operation. We investigate the use of deep learning to develop a real-time grasping scheme on a physical robot. Using a Baxter Research Robot and a Kinect sensor, a convolutional neural network (CNN) was trained in a supervised manner to regress grasping coordinates from RGB-D data. Compared to existing methods, regression via deep learning offered an efficient process that learnt generalised grasping features and processed the scene in real time. The system achieved a successful grasp rate of 62% and a successful detection rate of 78% on a diverse set of physical objects across varying positions and orientations, executing grasp detection in 1.8 s on a CPU machine and a complete physical grasp-and-move in 60 s on the robot.
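The supervised grasp regression the abstract describes can be sketched as follows. This is a minimal illustration written in PyTorch for brevity (the authors' implementation used Caffe); the network layers and the five-parameter grasp encoding (centre x, y, angle, gripper opening, height) are illustrative assumptions borrowed from the common grasp-rectangle parameterisation, not the paper's exact design.

```python
# Minimal sketch of supervised grasp regression from RGB-D input.
# Hypothetical architecture: the paper's actual Caffe network is not
# reproduced here; the 5-D grasp output (x, y, theta, w, h) follows
# the common grasp-rectangle parameterisation in the literature.
import torch
import torch.nn as nn

class GraspRegressor(nn.Module):
    def __init__(self):
        super().__init__()
        # 4 input channels: RGB and depth stacked as a single tensor.
        self.features = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # Regression head: grasp centre (x, y), angle, opening, height.
        self.head = nn.Linear(128, 5)

    def forward(self, rgbd):                # rgbd: (B, 4, H, W)
        z = self.features(rgbd).flatten(1)  # (B, 128)
        return self.head(z)                 # (B, 5) grasp parameters

# One supervised training step with an L2 loss on labelled grasps.
model = GraspRegressor()
optimiser = torch.optim.SGD(model.parameters(), lr=1e-3)
rgbd = torch.randn(8, 4, 224, 224)   # dummy batch of RGB-D crops
target = torch.randn(8, 5)           # dummy ground-truth grasp labels
optimiser.zero_grad()
loss = nn.functional.mse_loss(model(rgbd), target)
loss.backward()
optimiser.step()
```

Regressing the grasp directly, rather than scoring many candidate rectangles, is what keeps inference to a single forward pass and makes CPU-only detection times on the order of seconds plausible.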
A Appendix
Videos of the grasping system may be found at tiny.cc/birlDeepGrasp. Full training and testing results may be found at tiny.cc/birlTraining and tiny.cc/birlTesting. The ROS-related code may be found at tiny.cc/birlGraspCode.
Copyright information
© 2017 Springer International Publishing AG
Cite this paper
Watson, J., Hughes, J., Iida, F. (2017). Real-World, Real-Time Robotic Grasping with Convolutional Neural Networks. In: Gao, Y., Fallah, S., Jin, Y., Lekakou, C. (eds) Towards Autonomous Robotic Systems. TAROS 2017. Lecture Notes in Computer Science, vol 10454. Springer, Cham. https://doi.org/10.1007/978-3-319-64107-2_50