An improved residual learning model and its application to hardware image classification

Original article · Published in The Visual Computer (2024)

Abstract

Different classes of hardware can be similar in color and shape, while hardware within a single class can vary considerably, which lowers the accuracy of classification models. Classification performance also depends heavily on the network architecture. To address this problem, this paper proposes an improved residual network architecture that achieves high accuracy in hardware classification, together with a hardware dataset that fills a gap in the availability of hardware images. The improved residual architecture is built on depthwise over-parameterized convolution and pyramidal convolution, allowing the model to capture information from different levels and channels. Moreover, unlike other recent architectures, our model achieves this without increasing computation. Experiments on the hardware dataset show that our architecture achieves better accuracy than notable networks such as ResNet50 and DenseNet121, with a classification accuracy of 94.6% and an F1 score of 90.26%. To further verify the proposed architecture, we applied the model to fine-grained datasets, where it achieved the best performance under the same training settings.
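To make the architectural idea concrete, the sketch below is a minimal PyTorch illustration, not the authors' published code: the module names (DOConv2d, PyConv2d, ImprovedResBlock), the channel counts, the kernel pyramid (3/5/7/9), and the group sizes are all our own assumptions. It shows the two ingredients the abstract names: a depthwise over-parameterized kernel that folds into a single convolution, and a pyramidal convolution that gathers features at several receptive fields inside a residual block.

```python
# Illustrative sketch only, under assumed layer sizes; NOT the paper's exact design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DOConv2d(nn.Module):
    """Simplified DO-Conv: a trainable depthwise kernel D is composed with the
    conventional kernel W; the composition collapses to one conv kernel, so
    inference costs the same as an ordinary convolution."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.k, self.in_ch, self.out_ch = k, in_ch, out_ch
        self.W = nn.Parameter(torch.empty(out_ch, in_ch, k * k))
        nn.init.kaiming_uniform_(self.W, a=5 ** 0.5)
        # Initialize D to the identity so training starts as a plain conv.
        self.D = nn.Parameter(torch.eye(k * k).repeat(in_ch, 1, 1))

    def forward(self, x):
        # Fold D into W per input channel: W'[o, c] = D[c] @ W[o, c]
        W = torch.einsum('cij,ocj->oci', self.D, self.W)
        W = W.reshape(self.out_ch, self.in_ch, self.k, self.k)
        return F.conv2d(x, W, padding=self.k // 2)

class PyConv2d(nn.Module):
    """Pyramidal convolution: parallel grouped convs with growing kernel sizes,
    each producing a slice of the output channels."""
    def __init__(self, in_ch, out_ch, kernels=(3, 5, 7, 9), groups=(1, 4, 8, 16)):
        super().__init__()
        split = out_ch // len(kernels)  # assumes out_ch divisible by len(kernels)
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, split, k, padding=k // 2, groups=g, bias=False)
            for k, g in zip(kernels, groups)
        )

    def forward(self, x):
        # Each branch sees the full input at a different receptive field.
        return torch.cat([b(x) for b in self.branches], dim=1)

class ImprovedResBlock(nn.Module):
    """Bottleneck-style residual block using PyConv as the spatial convolution
    and a DO-Conv projection; channel counts here are placeholders."""
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            DOConv2d(ch, ch, k=1),
            nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
            PyConv2d(ch, ch),
            nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 1, bias=False),
            nn.BatchNorm2d(ch),
        )

    def forward(self, x):
        return F.relu(x + self.body(x))  # identity shortcut, as in ResNet

x = torch.randn(1, 64, 56, 56)
print(ImprovedResBlock(64)(x).shape)  # torch.Size([1, 64, 56, 56])
```

Because the composed kernel collapses to an ordinary convolution weight, the over-parameterization adds training capacity without extra inference cost, which is consistent with the abstract's claim that the model does not require increased computation.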



Data availability

Due to the nature of this research, the participants in this study did not consent to their data being shared publicly, so supporting data are not available.

Acknowledgements

This work was supported by the Key R&D Program of Jiangxi Province of China (Grant No. 20181BBG70031) and the National Natural Science Foundation of China (Grant No. 62066027).

Author information

Contributions

Zhentao Zhang, Wenhao Li and Yuxi Cheng contributed equally to this manuscript. Qingnan Huang and Taorong Qiu provided valuable assistance. Correspondence should be addressed to Taorong Qiu: qiutaorong@ncu.edu.cn.

Corresponding author

Correspondence to Taorong Qiu.

Ethics declarations

Conflicts of interest

Neither the writing process nor the content of this paper gives grounds for raising the issue of a conflict of interest.

Ethical approval

This paper is a completely original work of its authors; it has not been published before and will not be submitted to other publications unless the journal's editorial board decides not to accept it for publication.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhang, Z., Li, W., Cheng, Y. et al. An improved residual learning model and its application to hardware image classification. Vis Comput (2024). https://doi.org/10.1007/s00371-024-03340-3

