
Improved inception-residual convolutional neural network for object recognition

  • Original Article
  • Published in: Neural Computing and Applications

Abstract

Machine learning and computer vision have driven many of the greatest recent advances in Deep Convolutional Neural Network (DCNN) modeling. Most current research focuses on improving recognition accuracy through better DCNN models and learning approaches, yet the recurrent convolutional approach appears in only a few DCNN architectures. Meanwhile, Inception-v4 and Residual networks have quickly become popular in the computer vision community. In this paper, we introduce a new DCNN model called the Inception Recurrent Residual Convolutional Neural Network (IRRCNN), which combines the strengths of the Recurrent Convolutional Neural Network (RCNN), the Inception network, and the Residual network. This approach improves the recognition accuracy of the Inception-residual network while keeping the same number of network parameters. In addition, the proposed architecture generalizes the Inception network, the RCNN, and the Residual network with significantly improved training accuracy. We empirically evaluate the performance of the IRRCNN model on several benchmarks, including CIFAR-10, CIFAR-100, TinyImageNet-200, and CU3D-100. The experimental results show higher recognition accuracy than most popular DCNN models, including the RCNN. We also compare the IRRCNN against its Equivalent Inception Network (EIN) and Equivalent Inception Residual Network (EIRN) counterparts on the CIFAR-100 dataset, reporting improvements in classification accuracy of around 4.53%, 4.49%, and 3.56% over the RCNN, EIN, and EIRN, respectively. Furthermore, experiments on the TinyImageNet-200 and CU3D-100 datasets show that the IRRCNN yields better testing accuracy than the Inception Recurrent CNN, the EIN, the EIRN, Inception-v3, and Wide Residual Networks.
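To make the composition of the three ingredients concrete, the sketch below shows one plausible IRRCNN-style block in Keras. It is a minimal illustration under assumed settings, not the authors' reported configuration: the branch kernel sizes, layer widths (`filters`), and recurrent unrolling depth (`steps`) are hypothetical.

```python
import tensorflow as tf
from tensorflow.keras import layers

def recurrent_conv(x, filters, kernel_size, steps=2):
    # Recurrent convolutional layer (RCL), unrolled for a fixed number of
    # steps. The feed-forward conv runs once; the recurrent conv reuses the
    # same weights at every step and is summed with the feed-forward response.
    feed_forward = layers.Conv2D(filters, kernel_size, padding="same")
    recurrent = layers.Conv2D(filters, kernel_size, padding="same")
    state = feed_forward(x)
    out = state
    for _ in range(steps):
        out = layers.add([state, recurrent(out)])
        out = layers.Activation("relu")(layers.BatchNormalization()(out))
    return out

def irrcnn_block(x, filters=32, steps=2):
    # Inception-style arrangement: parallel RCL branches with different
    # kernel sizes are concatenated, then a residual (identity) connection
    # is added around the whole block. Branch choices here are illustrative.
    b1 = recurrent_conv(x, filters, 1, steps)   # 1x1 branch
    b3 = recurrent_conv(x, filters, 3, steps)   # 3x3 branch
    merged = layers.concatenate([b1, b3])
    # Project back to the input depth so the residual addition is valid.
    merged = layers.Conv2D(x.shape[-1], 1, padding="same")(merged)
    return layers.Activation("relu")(layers.add([x, merged]))

# Usage: stack the block inside a small functional model (CIFAR-100-sized).
inputs = tf.keras.Input(shape=(32, 32, 3))
x = layers.Conv2D(64, 3, padding="same", activation="relu")(inputs)
x = irrcnn_block(x, filters=32, steps=2)
x = layers.GlobalAveragePooling2D()(x)
model = tf.keras.Model(inputs, layers.Dense(100, activation="softmax")(x))
```

Because the recurrent convolution in each branch reuses one set of weights across all unrolled steps, increasing `steps` deepens the effective computation without adding parameters, which is the property the abstract highlights when comparing against the equivalent inception-residual network.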



Author information


Corresponding author

Correspondence to Tarek M. Taha.


About this article


Cite this article

Alom, M.Z., Hasan, M., Yakopcic, C. et al. Improved inception-residual convolutional neural network for object recognition. Neural Comput & Applic 32, 279–293 (2020). https://doi.org/10.1007/s00521-018-3627-6

