Abstract
Hardware items from different classes can share similar colors and shapes, while items within a single class can vary considerably, which lowers the accuracy of classification models; classification performance also depends heavily on the network architecture. To address this problem, this paper proposes an improved residual network architecture that achieves high accuracy in hardware classification. We also introduce a hardware dataset that fills a gap in hardware image data. The improved residual architecture is built on depthwise over-parameterized convolution and pyramidal convolution, allowing the model to capture information at multiple levels and across channels. Moreover, compared with recent architectures, our model achieves this without additional computation. Experiments on the hardware dataset show that the proposed network outperforms notable architectures such as ResNet50 and DenseNet121, reaching a classification accuracy of 94.6% and an F1 score of 90.26%. To further validate the proposed architecture, we applied the model to fine-grained datasets, where it achieved the best performance under the same training settings.
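To make the architectural idea concrete, the sketch below shows one way a residual block combining pyramidal convolution and depthwise over-parameterized convolution could be assembled in PyTorch. It is a minimal illustration based on the public descriptions of DO-Conv and PyConv, not the authors' released implementation; the class names (`DOConv2d`, `PyConvDOBlock`), layer widths, kernel sizes, and grouping factors are all illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of a residual block that combines a
# pyramidal convolution (parallel grouped convs with different kernel sizes, as
# in PyConv) with a simplified depthwise over-parameterized convolution
# (DO-Conv, where an extra depthwise factor is folded into the standard kernel).
import torch
import torch.nn as nn
import torch.nn.functional as F


class DOConv2d(nn.Module):
    """Simplified DO-Conv: compose a depthwise factor D with a conventional
    kernel W, then apply a single fused convolution."""
    def __init__(self, in_ch, out_ch, k=3, stride=1, padding=1):
        super().__init__()
        self.stride, self.padding, self.k = stride, padding, k
        d_mul = k * k  # depth multiplier of the over-parameterization (assumed)
        self.W = nn.Parameter(torch.randn(out_ch, in_ch, d_mul) * 0.02)
        # identity init so the fused kernel initially equals W reshaped
        self.D = nn.Parameter(torch.eye(d_mul).repeat(in_ch, 1, 1))  # (in_ch, d_mul, k*k)

    def forward(self, x):
        # Fold the depthwise factor into the conventional kernel: one conv at inference.
        fused = torch.einsum('ocd,cds->ocs', self.W, self.D)
        fused = fused.view(self.W.shape[0], self.W.shape[1], self.k, self.k)
        return F.conv2d(x, fused, stride=self.stride, padding=self.padding)


class PyConvDOBlock(nn.Module):
    """Residual block: pyramidal conv (3x3 and 5x5 branches) followed by a DO-Conv."""
    def __init__(self, channels):
        super().__init__()
        half = channels // 2
        self.branch3 = nn.Conv2d(channels, half, 3, padding=1, groups=1)   # fine scale
        self.branch5 = nn.Conv2d(channels, half, 5, padding=2, groups=4)   # coarse scale
        self.bn1 = nn.BatchNorm2d(channels)
        self.doconv = DOConv2d(channels, channels, k=3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        y = torch.cat([self.branch3(x), self.branch5(x)], dim=1)  # multi-scale features
        y = F.relu(self.bn1(y))
        y = self.bn2(self.doconv(y))
        return F.relu(x + y)  # identity shortcut, as in ResNet


if __name__ == "__main__":
    block = PyConvDOBlock(64)
    print(block(torch.randn(1, 64, 56, 56)).shape)  # -> torch.Size([1, 64, 56, 56])
```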
Data availability
Due to the nature of this research, participants of this paper did not agree for their data to be shared publicly, so supporting data is not available.
References
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). pp. 770–778 (2016)
Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., Adam, H.: MobileNets: efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861 (2017). https://doi.org/10.48550/arXiv.1704.04861
Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.C.: MobileNetV2: inverted residuals and linear bottlenecks. In: Proceedings of the 2018 IEEE/CVF Conference on computer vision and pattern recognition (CVPR). pp. 4510–4520 (2018). https://doi.org/10.48550/arXiv.1801.04381
Howard, A., Sandler, M., Chen, B., Wang, W., Chen, L.C., Tan, M., Chu, G., Vasudevan, V., Zhu, Y., Pang, R.: Searching for MobileNetV3. In: Proceedings of the 2019 IEEE/CVF International conference on computer vision (ICCV). pp. 1314–1324 (2019)
Huang, G., Liu, Z., van der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the 2017 IEEE Conference on computer vision and pattern recognition (CVPR). pp. 4700–4708 (2017). https://doi.org/10.48550/arXiv.1608.06993
Cao, J., Li, Y., Sun, M., Chen, Y., Lischinski, D., Cohen-Or, D., Chen, B., Tu, C.: DO-Conv: depthwise over-parameterized convolutional layer. IEEE Trans. Image Process. 31, 3726–3736 (2022). https://doi.org/10.1109/TIP.2022.3175432
Duta, I.C., Liu, L., Zhu, F., Shao, L.: Pyramidal convolution: rethinking convolutional neural networks for visual recognition. arXiv preprint arXiv:2006.11538 (2020). https://doi.org/10.48550/arXiv.2006.11538
Dalal, N., Triggs, B.: Histograms of Oriented Gradients for Human Detection. In: 2005 IEEE computer society conference on computer vision and pattern recognition (CVPR'05), Vol 1, pp. 886–893 (2005). https://doi.org/10.1109/CVPR.2005.177
Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60, 91–110 (2004). https://doi.org/10.1023/B:VISI.0000029664.99615.94
Bay, H., Ess, A., Tuytelaars, T., Van Gool, L.: Speeded-up robust features (SURF). Comput. Vis. Image Underst. 110(3), 346–359 (2008). https://doi.org/10.1016/j.cviu.2007.09.014
Zhu, X., Lu, J., Ren, H., Wang, H., Sun, B.: A transformer–CNN for deep image inpainting forensics. Vis. Comput. 39, 4721–4735 (2023). https://doi.org/10.1007/s00371-022-02620-0
Wang, S., Zhang, S., Zhang, X., Geng, Q.: A two-branch hand gesture recognition approach combining atrous convolution and attention mechanism. Vis. Comput. 39, 4487–4500 (2023). https://doi.org/10.1007/s00371-022-02602-2
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014). https://doi.org/10.48550/arXiv.1409.1556
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Rabinovich, A.: Going deeper with convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). pp. 1–9 (2015)
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. In: Proceedings of the 2016 IEEE/CVF Conference on computer vision and pattern recognition (CVPR). pp. 2818–2826 (2016)
Wang, W., Han, C., Zhou, T., Liu, D.: Visual recognition with deep nearest centroids. arXiv preprint arXiv:2209.07383 (2022). https://doi.org/10.48550/arXiv.2209.07383
Liu, Z., Mao, H., Wu, C., Feichtenhofer, C., Darrell, T., Xie, S.: A ConvNet for the 2020s. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR). pp. 11976–11986 (2022). https://doi.org/10.48550/arXiv.2201.03545
Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the 2018 IEEE/CVF Conference on computer vision and pattern recognition (CVPR). pp. 7132–7141 (2018). https://doi.org/10.48550/arXiv.1709.01507
Tan, M., Le, Q.: EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. In: International conference on machine learning. pp. 6105–6114 (2019)
Zhang, X., Zhou, X., Lin, M., Sun, J.: ShuffleNet: an extremely efficient convolutional neural network for mobile devices. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). pp. 6848–6856 (2018). https://doi.org/10.48550/arXiv.1707.01083
Chollet, F.: Xception: Deep learning with depthwise separable convolutions. In: Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). (2017)
Soltanolkotabi, M., Javanmard, A., Lee, J.D.: Theoretical insights into the optimization landscape of over-parameterized shallow neural networks. IEEE Trans. Inf. Theory 65(2), 742–769 (2018). https://doi.org/10.1109/TIT.2018.2854560
Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. In: International conference on machine learning. 448–456 (2015)
Li, J., Xue, Y., Wang, W., Ouyang, G.: Cross-level parallel network for crowd counting. IEEE Trans. Ind. Inform. 16(1), 566–576 (2019). https://doi.org/10.1109/TII.2019.2935244
Xing, H., Wang, S., Zheng, D., Zhao, X.: Dual attention based feature pyramid network. China Commun. 17(8), 242–252 (2020). https://doi.org/10.23919/JCC.2020.08.020
Hu, X., Jing, L.: LDPNet: a lightweight densely connected pyramid network for real-time semantic segmentation. IEEE Access 8, 212647–212658 (2020). https://doi.org/10.1109/ACCESS.2020.3038864
Bi, Q., Qin, K., Li, Z., Zhang, H., Xia, G.S.: A multiple-instance densely-connected convnet for aerial scene classification. IEEE Trans. Image Process. 29, 4911–4926 (2020). https://doi.org/10.1109/TIP.2020.2975718
Zagoruyko, S., Komodakis, N.: Wide residual networks. arXiv preprint arXiv:1605.07146 (2016). https://doi.org/10.48550/arXiv.1605.07146
He, K., Zhang, X., Ren, S., Sun, J.: Identity mappings in deep residual networks. In: European conference on computer vision (ECCV). (2016). https://doi.org/10.1007/978-3-319-46493-0_38
Szegedy, C., Ioffe, S., Vanhoucke, V., Alemi, A.: Inception-v4, inception-resnet and the impact of residual connections on learning. In: Proceedings of the AAAI conference on artificial intelligence, Vol. 31(1), (2017). https://doi.org/10.1609/aaai.v31i1.11231
Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. Adv. Neural. Inf. Process. Syst. 25, 1097–1105 (2012)
Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., Torralba, A.: Object detectors emerge in deep scene CNNs. arXiv preprint arXiv:1412.6856 (2014). https://doi.org/10.48550/arXiv.1412.6856
Wah, C., Branson, S., Welinder, P., Perona, P., Belongie, S.: The Caltech-UCSD Birds-200–2011 dataset. Calif. Inst. Technol. (2011)
Maji, S., Rahtu, E., Kannala, J., Blaschko, M., Vedaldi, A.: Fine-grained visual classification of aircraft. arXiv preprint arXiv:1306.5151 (2013). https://doi.org/10.48550/arXiv.1306.5151
Krause, J., Stark, M., Deng, J., Fei-Fei, L.: 3D Object representations for fine-grained categorization. In: Proceedings of the 4th International IEEE Workshop on 3D Representation and Recognition (3dRR-13), Sydney, Australia. 554–561 (2013)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: International conference on learning representations (ICLR). (2015)
Xie, S., Girshick, R., Dollár, P., Tu, Z., He, K.: Aggregated residual transformations for deep neural networks. In: Proceedings of the 2017 IEEE Conference on computer vision and pattern recognition (CVPR). pp. 1492–1500 (2017). https://doi.org/10.48550/arXiv.1611.05431
Acknowledgements
This work was supported by the Key R&D Program of Jiangxi Province of China (Grant No. 20181BBG70031) and the National Natural Science Foundation of China (Grant No. 62066027).
Author information
Authors and Affiliations
Contributions
Zhentao Zhang, Wenhao Li and Yuxi Cheng contributed equally to this manuscript. Qingnan Huang and Taorong Qiu provided helpful guidance for this manuscript. Correspondence should be addressed to Taorong Qiu: qiutaorong@ncu.edu.cn.
Corresponding author
Ethics declarations
Conflicts of interest
The process of writing and the content of the paper do not give grounds for raising the issue of a conflict of interest.
Ethical approval
This paper is a completely original work of its authors; it has not been published before and will not be sent to other publications until the PRIA editorial board decides not to accept it for publication.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhang, Z., Li, W., Cheng, Y. et al. An improved residual learning model and its application to hardware image classification. Vis Comput (2024). https://doi.org/10.1007/s00371-024-03340-3