
End-to-End ConvNet for Tactile Recognition Using Residual Orthogonal Tiling and Pyramid Convolution Ensemble

Published in: Cognitive Computation

Abstract

Tactile recognition enables robots to identify target objects or environments from tactile sensory readings. Recent advances in deep learning and biological tactile sensing inspire us to propose ROTConvPCE-mv, an end-to-end architecture that performs tactile recognition using residual orthogonal tiling and a pyramid convolution ensemble. Our approach takes stacks of raw frames and tactile flow as dual inputs, and incorporates the strength of multi-layer OTConvs (orthogonal tiling convolutions) organized in a residual learning paradigm. We empirically demonstrate that OTConvs offer adjustable invariance to input transformations such as translation, rotation, and scaling. To effectively capture multi-scale global context, a pyramid convolution structure is attached to the concatenated output of the two residual OTConv pathways. Extensive experimental evaluations show that ROTConvPCE-mv outperforms several state-of-the-art methods by a large margin in recognition accuracy, robustness, and fault tolerance. Practical suggestions and hints are summarized throughout the paper to facilitate effective recognition from tactile sensory data.
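
The full architecture is behind the access wall, but the two ideas the abstract names, "tiling" convolutions with untied kernels and multi-scale pyramid aggregation over concatenated pathway outputs, can be sketched in a few lines. The following is a minimal NumPy illustration under stated assumptions, not the authors' implementation: `tiled_conv2d` and `pyramid_pool` are hypothetical names, kernel orthogonalization and the residual connections are omitted, and average pooling stands in for the pyramid convolution ensemble.

```python
import numpy as np

def tiled_conv2d(x, kernels, tile=2):
    """Valid 2D convolution with untied kernels: neighbouring outputs use
    different kernels, repeating with period `tile` in each direction (the
    'tiling' idea; the paper's OTConv additionally constrains the kernel
    bank to be orthogonal). `kernels` has shape (tile, tile, k, k)."""
    k = kernels.shape[-1]
    H, W = x.shape
    out = np.zeros((H - k + 1, W - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + k, j:j + k] * kernels[i % tile, j % tile])
    return out

def pyramid_pool(feat, levels=(1, 2, 4)):
    """Multi-scale global context: average-pool `feat` into 1x1, 2x2 and 4x4
    grids and concatenate the cell means into one descriptor."""
    H, W = feat.shape
    parts = []
    for n in levels:
        hs, ws = H // n, W // n
        for a in range(n):
            for b in range(n):
                parts.append(feat[a * hs:(a + 1) * hs, b * ws:(b + 1) * ws].mean())
    return np.array(parts)

rng = np.random.default_rng(0)
frames = rng.standard_normal((9, 9))    # stand-in for a raw tactile frame
flow = rng.standard_normal((9, 9))      # stand-in for a tactile-flow frame
kernels = rng.standard_normal((2, 2, 2, 2))

# Two pathways, then pyramid aggregation over each and concatenation.
fa = tiled_conv2d(frames, kernels)      # 8x8 feature map
fb = tiled_conv2d(flow, kernels)
desc = np.concatenate([pyramid_pool(fa), pyramid_pool(fb)])
print(desc.shape)                       # (42,) = 2 * (1 + 4 + 16)
```

The final descriptor concatenates pooled context from the raw-frame and tactile-flow pathways, mirroring the dual-input design the abstract describes.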


Figures 1–18 appear in the full article.


Notes

  1. For visualizing the tactile flow, we use the source code from Sun et al. [46]: http://cs.brown.edu/people/dqsun

  2. Building on the JKSC source code provided by the authors of [23], we parallelized the joint-kernel computation and replaced the code for DTW (Dynamic Time Warping) calculation with a compiled C binary: www.mathworks.com/matlabcentral/profile/authors/2797017-quan-wang.

  3. MV/ST-HMP: https://github.com/mmadry/st-hmp

  4. http://vision.ucla.edu/~doretto/projects/dynamic-recognition.html

  5. Test scores for BoS-LDSs are provided by the authors of [5].
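
Note 2 above refers to speeding up DTW with a compiled C binary. For orientation, the quantity involved is the standard dynamic-programming DTW distance between two sequences; the sketch below is an illustration only, not the JKSC code, and `dtw_distance` is a hypothetical name.

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic Time Warping distance between two 1-D sequences via the
    classic O(len(a) * len(b)) dynamic program, using |a_i - b_j| as the
    local cost; warping lets one sample align with several of the other."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])

print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0: the repeated 2 warps away
```

Because this inner loop is pure Python, replacing it with compiled code (as the note describes) is the natural optimization for large sequence collections.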

References

  1. Sun F, Liu C, Huang W, Zhang J. Object classification and grasp planning using visual and tactile sensing. IEEE Trans Syst Man Cybern Syst 2016;46(7):969–979.


  2. Kappassov Z, Corrales JA, Perdereau V. Tactile sensing in dexterous robot hands. Robot Auton Syst 2015;74:195–220.


  3. Xu D, Loeb GE, Fishel JA. Tactile identification of objects using bayesian exploration. Proceedings of ICRA; 2013. p. 3056–3061.

  4. Xiao W, Sun F, Liu H, He C. Dexterous robotic hand grasp learning using piecewise linear dynamic systems model. Proceedings of ICCSIP; 2014. p. 845–855.

  5. Ma R, Liu H, Sun F, Yang Q, Gao M. Linear dynamic system method for tactile object classification. Sci China Inform Sci 2014;57(12):1–11.


  6. Madry M, Bo L, Kragic D, Fox D. ST-HMP: unsupervised Spatio-temporal feature learning for tactile data. Proceedings of ICRA; 2014. p. 2262–2269.

  7. Spiers AJ, Liarokapis MV, Calli B, Dollar AM. Single-grasp object classification and feature extraction with simple robot hands and tactile sensors. IEEE Trans Haptics 2016;9(2):207–220.


  8. Liu H, Greco J, Song X, Bimbo J, Seneviratne L, Althoefer K. Tactile image based contact shape recognition using neural network. Proceedings of MFI; 2012. p. 138–143.

  9. Hoelscher J, Peters J, Hermans T. Evaluation of tactile feature extraction for interactive object recognition. Proceedings of IEEE-RAS 15th international conference on humanoid robots (humanoids). IEEE; 2015. p. 310–317.

  10. Matsubara T, Shibata K. Active tactile exploration with uncertainty and travel cost for fast shape estimation of unknown objects. Robot Auton Syst 2017;91:314–326.


  11. Bekiroglu Y, Laaksonen J, Jorgensen JA, Kyrki V, Kragic D. Assessing grasp stability based on learning and haptic data. IEEE Trans Robot 2011;27(3):616–629.


  12. Dang H, Allen PK. Stable grasping under pose uncertainty using tactile feedback. Auton Robot 2014;36(4):309–330.


  13. Kwiatkowski J, Cockburn D, Duchaine V. Grasp stability assessment through the fusion of proprioception and tactile signals using convolutional neural networks. Proceedings of IROS. IEEE; 2017. p. 286–292.

  14. Yang H, Liu X, Cao L, Sun F. A new slip-detection method based on pairwise high frequency components of capacitive sensor signals. Proceedings of ICIST; 2015. p. 56–61.

  15. Heyneman B, Cutkosky MR. Slip classification for dynamic tactile array sensors. Int J Robot Res 2016;35(4):404–421.


  16. Gorges N, Navarro SE, Goger D, Worn H. Haptic object recognition using passive joints and haptic key features. Proceedings of ICRA; 2010. p. 2349–2355.

  17. Luo S, Mou W, Althoefer K, Liu H. Novel tactile-sift descriptor for object shape recognition. IEEE Sensors J 2015;15(9):5001–5009.


  18. Corradi T, Hall P, Iravani P. Bayesian tactile object recognition: Learning and recognising objects using a new inexpensive tactile sensor. Proceedings of ICRA; 2015. p. 3909–3914.

  19. Bekiroglu Y, Kragic D, Kyrki V. Learning grasp stability based on tactile data and HMMs. Proceedings of RO-MAN; 2010. p. 132–137.

  20. Soh H, Su Y, Demiris Y. Online spatio-temporal gaussian process experts with application to tactile classification. Proceedings of IROS; 2012. p. 4489–4496.

  21. Gogulski J, Boldt R, Savolainen P, Guzmán-López J, Carlson S, Pertovaara A. A segregated neural pathway for prefrontal top-down control of tactile discrimination. Cereb Cortex 2015;25(1):161–166.


  22. Drimus A, Kootstra G, Bilberg A, Kragic D. Design of a flexible tactile sensor for classification of rigid and deformable objects. Robot Auton Syst 2014;62(1):3–15.


  23. Liu H, Guo D, Sun F. Object recognition using tactile measurements: kernel sparse coding methods. IEEE Trans Instrum Meas 2016;65(3):656–665.


  24. Chebotar Y, Hausman K, Su Z, Sukhatme GS, Schaal S. Self-supervised regrasping using spatio-temporal tactile features and reinforcement learning. Proceedings of IROS; 2016. p. 1960–1966.

  25. Wu H, Jiang D, Gao H. Tactile motion recognition with convolutional neural networks. Proceedings of IROS; 2017. p. 1572–1577.

  26. Huang W, Sun F, Cao L, Zhao D, Liu H, Harandi M. Sparse coding and dictionary learning with linear dynamical systems. Proceedings of CVPR; 2016. p. 3938–3947.

  27. Karpathy A, Toderici G, Shetty S, Leung T, Sukthankar R, Fei-Fei L. Large-scale video classification with convolutional neural networks. Proceedings of CVPR; 2014. p. 1725–1732.

  28. Tu Z, Zheng A, Yang E, Luo B, Hussain A. A biologically inspired vision-based approach for detecting multiple moving objects in complex outdoor scenes. Cognitive Comput 2015;7(5):539–551.


  29. Tu Z, Abel A, Zhang L, Luo B, Hussain A. A new spatio-temporal saliency-based video object segmentation. Cognitive Comput 2016;8(4):629–647.


  30. Tünnermann J, Mertsching B. Region-based artificial visual attention in space and time. Cognitive Comput 2014;6(1):125–143.


  31. Simonyan K, Zisserman A. Two-stream convolutional networks for action recognition in videos. Proceedings of NIPS; 2014. p. 568–576.

  32. Guo D, Sun F, Fang B, Yang C, Xi N. Robotic grasping using visual and tactile sensing. Inf Sci 2017;417:274–286.


  33. Cao L, Kotagiri R, Sun F, Li H, Huang W, Aye ZMM. Efficient spatio-temporal tactile object recognition with randomized tiling convolutional networks in a hierarchical fusion strategy. Proceedings of the 30th AAAI; 2016. p. 3337–3345.

  34. Gallace A, Spence C. The cognitive and neural correlates of “tactile consciousness”: a multisensory perspective. Conscious Cogn 2008;17(1):370–407.


  35. Zeiler MD, Fergus R. Visualizing and understanding convolutional networks. Proceedings of ECCV; 2014. p. 818–833.

  36. Ngiam J, Chen Z, Chia D, Koh PW, Le QV, Ng AY. Tiled convolutional neural nets. Proceedings of NIPS; 2010. p. 1279–1287.

  37. Lee H, Grosse R, Ranganath R, Ng AY. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. Proceedings of ICML; 2009. p. 609–616.

  38. Gong Y, Wang L, Guo R, Lazebnik S. Multi-scale orderless pooling of deep convolutional activation features. Proceedings of ECCV; 2014. p. 392–407.

  39. Chen LC, Papandreou G, Kokkinos I, Murphy K, Yuille AL. Deeplab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans Pattern Anal Mach Intell 2017;40(4):834–848.


  40. Saxe A, Koh PW, Chen Z, Bhand M, Suresh B, Ng AY. On random weights and unsupervised feature learning. Proceedings of ICML; 2011. p. 1089–1096.

  41. Jarrett K, Kavukcuoglu K, Ranzato M, LeCun Y. What is the best multi-stage architecture for object recognition?. Proceedings of CVPR; 2009. p. 2146–2153.

  42. Pinto N, Doukhan D, DiCarlo JJ, Cox DD. A high-throughput screening approach to discover good forms of biologically inspired visual representation. PLoS Comput Biol 2009;5(11):e1000579.


  43. Huang GB, Bai Z, Kasun LLC, Vong CM. Local receptive fields based extreme learning machine. IEEE Comput Intell Mag 2015;10(2):18–29.


  44. He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. Proceedings of the IEEE conference on computer vision and pattern recognition; 2016. p. 770–778.

  45. Bicchi A, Scilingo EP, Ricciardi E, Pietrini P. Tactile flow explains haptic counterparts of common visual illusions. Brain Res Bull 2008;75(6):737–741.


  46. Sun D, Roth S, Black MJ. Secrets of optical flow estimation and their principles. Proceedings of CVPR; 2010. p. 2432–2439.

  47. Horn BK, Schunck BG. Determining optical flow. Artif Intell 1981;17:185–203.


  48. Ioffe S, Szegedy C. Batch normalization: accelerating deep network training by reducing internal covariate shift. International conference on machine learning; 2015. p. 448–456.

  49. Spratling MW. A hierarchical predictive coding model of object recognition in natural images. Cognitive Comput 2017;9(2):151–167.


  50. Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A. Going deeper with convolutions. Proceedings of CVPR; 2015. p. 1–9.

  51. He K, Zhang X, Ren S, Sun J. Spatial pyramid pooling in deep convolutional networks for visual recognition. Proceedings of ECCV; 2014. p. 346–361.

  52. Zhao H, Shi J, Qi X, Wang X, Jia J. Pyramid scene parsing network. Proceedings of CVPR; 2017. p. 2881–2890.

  53. Liu X, Deng Z. Segmentation of drivable road using deep fully convolutional residual network with pyramid pooling. Cognitive Comput 2018:1–10.

  54. Hu X, Zhang X, Liu M, Chen Y, Li P, Pei W, Zhang C, Chen H. A flexible capacitive tactile sensor array with micro structure for robotic application. Sci China Inform Sci 2014;57(12):1–6.


  55. Zhang J, Cui J, Lu Y, Zhang X, Hu X. A flexible capacitive tactile sensor for manipulator. Proceedings of ICCSIP; 2016. p. 303–309.

  56. Nair V, Hinton GE. Rectified linear units improve restricted boltzmann machines. Proceedings of the 27th ICML; 2010. p. 807–814.

  57. Jia Y, Shelhamer E, Donahue J, Karayev S, Long J, Girshick R, Guadarrama S, Darrell T. Caffe: convolutional architecture for fast feature embedding. Proceedings of the 22nd ACM international conference on Multimedia. ACM; 2014. p. 675–678.

  58. Scardapane S, Wang D. Randomness in neural networks: an overview. Wiley Interdisciplinary Rev: Data Mining Knowl Discovery 2017;7(2):e1200.


  59. Bo L, Ren X, Fox D. Hierarchical matching pursuit for image classification. Proceedings of NIPS; 2011. p. 2115–2123.

  60. Saisan P, Doretto G, Wu YN, Soatto S. Dynamic texture recognition. Proceedings of CVPR; 2001. p. 58–63.

  61. Johnson BW. Fault-tolerant microprocessor-based systems. IEEE Micro 1984;4(6):6–21.


  62. Cao L, Sun F, Liu X, Huang W, Cheng W, Kotagiri R. Fix-budget and recurrent data mining for online haptic perception. International conference on neural information processing; 2017. p. 581–591.


Acknowledgements

We thank Weihao Cheng for suggesting momentum prediction [62] while we iterated on this work. We are also grateful to Jingwei Yang and Rui Ma for explaining the JKSC and BoS-LDSs source code, and to Xiaohui Hu and Haolin Yang for their help in collecting the HCs10 dataset.

Funding

This work was supported by grants from the China National Natural Science Foundation (Nos. 61327809 and 61210013). Lele Cao is also supported by the State Scholarship Fund under file number 201406210275.

Author information


Corresponding author

Correspondence to Lele Cao.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Human and Animal Rights

This article does not contain any studies with human or animal subjects performed by any of the authors.

Rights and permissions

Reprints and permissions

About this article


Cite this article

Cao, L., Sun, F., Liu, X. et al. End-to-End ConvNet for Tactile Recognition Using Residual Orthogonal Tiling and Pyramid Convolution Ensemble. Cogn Comput 10, 718–736 (2018). https://doi.org/10.1007/s12559-018-9568-7

