Abstract
Hardware items from different classes can share similar colors and shapes, while items within a single class can vary considerably, which lowers the accuracy of classification models; classification performance also depends heavily on the network architecture. To address this problem, this paper proposes an improved residual network architecture that achieves high accuracy in hardware classification. We also introduce a hardware dataset that fills a gap in hardware image data. The improved residual architecture is built on depthwise over-parameterized convolution and pyramidal convolution, allowing the model to capture information at multiple levels and across channels. Moreover, compared with recent architectures, our model achieves this without additional computation. Experiments on the hardware dataset show that the proposed network outperforms notable architectures such as ResNet50 and DenseNet121, reaching a classification accuracy of 94.6% and an F1 score of 90.26%. To further validate the proposed architecture, we applied the model to fine-grained datasets, where it achieved the best performance under the same training settings.
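To make the architectural idea concrete, the sketch below shows one way a residual block combining pyramidal convolution and depthwise over-parameterized convolution could be assembled in PyTorch. It is a minimal illustration based on the public descriptions of DO-Conv and PyConv, not the authors' released implementation; the class names (`DOConv2d`, `PyConvDOBlock`), layer widths, kernel sizes, and grouping factors are all illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of a residual block that combines a
# pyramidal convolution (parallel grouped convs with different kernel sizes, as
# in PyConv) with a simplified depthwise over-parameterized convolution
# (DO-Conv, where an extra depthwise factor is folded into the standard kernel).
import torch
import torch.nn as nn
import torch.nn.functional as F


class DOConv2d(nn.Module):
    """Simplified DO-Conv: compose a depthwise factor D with a conventional
    kernel W, then apply a single fused convolution."""
    def __init__(self, in_ch, out_ch, k=3, stride=1, padding=1):
        super().__init__()
        self.stride, self.padding, self.k = stride, padding, k
        d_mul = k * k  # depth multiplier of the over-parameterization (assumed)
        self.W = nn.Parameter(torch.randn(out_ch, in_ch, d_mul) * 0.02)
        # identity init so the fused kernel initially equals W reshaped
        self.D = nn.Parameter(torch.eye(d_mul).repeat(in_ch, 1, 1))  # (in_ch, d_mul, k*k)

    def forward(self, x):
        # Fold the depthwise factor into the conventional kernel: one conv at inference.
        fused = torch.einsum('ocd,cds->ocs', self.W, self.D)
        fused = fused.view(self.W.shape[0], self.W.shape[1], self.k, self.k)
        return F.conv2d(x, fused, stride=self.stride, padding=self.padding)


class PyConvDOBlock(nn.Module):
    """Residual block: pyramidal conv (3x3 and 5x5 branches) followed by a DO-Conv."""
    def __init__(self, channels):
        super().__init__()
        half = channels // 2
        self.branch3 = nn.Conv2d(channels, half, 3, padding=1, groups=1)   # fine scale
        self.branch5 = nn.Conv2d(channels, half, 5, padding=2, groups=4)   # coarse scale
        self.bn1 = nn.BatchNorm2d(channels)
        self.doconv = DOConv2d(channels, channels, k=3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        y = torch.cat([self.branch3(x), self.branch5(x)], dim=1)  # multi-scale features
        y = F.relu(self.bn1(y))
        y = self.bn2(self.doconv(y))
        return F.relu(x + y)  # identity shortcut, as in ResNet


if __name__ == "__main__":
    block = PyConvDOBlock(64)
    print(block(torch.randn(1, 64, 56, 56)).shape)  # -> torch.Size([1, 64, 56, 56])
```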
Data availability
Due to the nature of this research, participants of this paper did not agree for their data to be shared publicly, so supporting data is not available.
References
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). pp. 770–778 (2016)
Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., Adam, H.: MobileNets: efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861 (2017). https://doi.org/10.48550/arXiv.1704.04861
Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.C.: MobileNetV2: inverted residuals and linear bottlenecks. In: Proceedings of the 2018 IEEE/CVF Conference on computer vision and pattern recognition (CVPR). pp. 4510–4520 (2018). https://doi.org/10.48550/arXiv.1801.04381
Howard, A., Sandler, M., Chen, B., Wang, W., Chen, L.C., Tan, M., Chu, G., Vasudevan, V., Zhu, Y., Pang, R.: Searching for MobileNetV3. In: Proceedings of the 2019 IEEE/CVF International conference on computer vision (ICCV). pp. 1314–1324 (2019)
Huang, G., Liu, Z., van der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the 2017 IEEE Conference on computer vision and pattern recognition (CVPR). pp. 4700–4708 (2017). https://doi.org/10.48550/arXiv.1608.06993
Cao, J., Li, Y., Sun, M., Chen, Y., Lischinski, D., Cohen-Or, D., Chen, B., Tu, C.: DO-Conv: depthwise over-parameterized convolutional layer. IEEE Trans. Image Process. 31, 3726–3736 (2022). https://doi.org/10.1109/TIP.2022.3175432
Duta, I.C., Liu, L., Zhu, F., Shao, L.: Pyramidal convolution: rethinking convolutional neural networks for visual recognition. arXiv preprint arXiv:2006.11538 (2020). https://doi.org/10.48550/arXiv.2006.11538
Dalal, N., Triggs, B.: Histograms of Oriented Gradients for Human Detection. In: 2005 IEEE computer society conference on computer vision and pattern recognition (CVPR'05), Vol 1, pp. 886–893 (2005). https://doi.org/10.1109/CVPR.2005.177
Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60, 91–110 (2004). https://doi.org/10.1023/B:VISI.0000029664.99615.94
Bay, H., Ess, A., Tuytelaars, T., Van Gool, L.: Speeded-up robust features (SURF). Comput. Vis. Image Underst. 110(3), 346–359 (2008). https://doi.org/10.1016/j.cviu.2007.09.014
Zhu, X., Lu, J., Ren, H., Wang, H., Sun, B.: A transformer–CNN for deep image inpainting forensics. Vis. Comput. 39, 4721–4735 (2023). https://doi.org/10.1007/s00371-022-02620-0
Wang, S., Zhang, S., Zhang, X., Geng, Q.: A two-branch hand gesture recognition approach combining atrous convolution and attention mechanism. Vis. Comput. 39, 4487–4500 (2023). https://doi.org/10.1007/s00371-022-02602-2
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014). https://doi.org/10.48550/arXiv.1409.1556
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Rabinovich, A.: Going deeper with convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). pp. 1–9 (2015)
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. In: Proceedings of the 2016 IEEE/CVF Conference on computer vision and pattern recognition (CVPR). pp. 2818–2826 (2016)
Wang, W., Han, C., Zhou, T., Liu, D.: Visual recognition with deep nearest centroids. arXiv preprint arXiv:2209.07383 (2022). https://doi.org/10.48550/arXiv.2209.07383
Liu, Z., Mao, H., Wu, C., Feichtenhofer, C., Darrell, T., Xie, S.: A ConvNet for the 2020s. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR). pp. 11976–11986 (2022). https://doi.org/10.48550/arXiv.2201.03545
Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the 2018 IEEE/CVF Conference on computer vision and pattern recognition (CVPR). pp. 7132–7141 (2018). https://doi.org/10.48550/arXiv.1709.01507
Tan, M., Le, Q.: EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. In: International conference on machine learning. pp. 6105–6114 (2019)
Zhang, X., Zhou, X., Lin, M., Sun, J.: ShuffleNet: an extremely efficient convolutional neural network for mobile devices. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). pp. 6848–6856 (2018). https://doi.org/10.48550/arXiv.1707.01083
Chollet, F.: Xception: Deep learning with depthwise separable convolutions. In: Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). (2017)
Soltanolkotabi, M., Javanmard, A., Lee, J.D.: Theoretical insights into the optimization landscape of over-parameterized shallow neural networks. IEEE Trans. Inf. Theory 65(2), 742–769 (2018). https://doi.org/10.1109/TIT.2018.2854560
Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. In: International conference on machine learning. 448–456 (2015)
Li, J., Xue, Y., Wang, W., Ouyang, G.: Cross-level parallel network for crowd counting. IEEE Trans. Ind. Inform. 16(1), 566–576 (2019). https://doi.org/10.1109/TII.2019.2935244
Xing, H., Wang, S., Zheng, D., Zhao, X.: Dual attention based feature pyramid network. China Commun. 17(8), 242–252 (2020). https://doi.org/10.23919/JCC.2020.08.020
Hu, X., Jing, L.: LDPNet: a lightweight densely connected pyramid network for real-time semantic segmentation. IEEE Access 8, 212647–212658 (2020). https://doi.org/10.1109/ACCESS.2020.3038864
Bi, Q., Qin, K., Li, Z., Zhang, H., Xia, G.S.: A multiple-instance densely-connected convnet for aerial scene classification. IEEE Trans. Image Process. 29, 4911–4926 (2020). https://doi.org/10.1109/TIP.2020.2975718
Zagoruyko, S., Komodakis, N.: Wide residual networks. arXiv preprint arXiv:1605.07146 (2016). https://doi.org/10.48550/arXiv.1605.07146
He, K., Zhang, X., Ren, S., Sun, J.: Identity mappings in deep residual networks. In: European conference on computer vision (ECCV). (2016). https://doi.org/10.1007/978-3-319-46493-0_38
Szegedy, C., Ioffe, S., Vanhoucke, V., Alemi, A.: Inception-v4, inception-resnet and the impact of residual connections on learning. In: Proceedings of the AAAI conference on artificial intelligence, Vol. 31(1), (2017). https://doi.org/10.1609/aaai.v31i1.11231
Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. Adv. Neural. Inf. Process. Syst. 25, 1097–1105 (2012)
Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., Torralba, A.: Object detectors emerge in deep scene CNNs. arXiv preprint arXiv:1412.6856 (2014). https://doi.org/10.48550/arXiv.1412.6856
Wah, C., Branson, S., Welinder, P., Perona, P., Belongie, S.: The Caltech-UCSD Birds-200–2011 dataset. Calif. Inst. Technol. (2011)
Maji, S., Rahtu, E., Kannala, J., Blaschko, M., Vedaldi, A.: Fine-grained visual classification of aircraft. arXiv preprint arXiv:1306.5151 (2013). https://doi.org/10.48550/arXiv.1306.5151
Krause, J., Stark, M., Deng, J., Fei-Fei, L.: 3D Object representations for fine-grained categorization. In: Proceedings of the 4th International IEEE Workshop on 3D Representation and Recognition (3dRR-13), Sydney, Australia. 554–561 (2013)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: International conference on learning representations (ICLR). (2015)
Xie, S., Girshick, R., Dollár, P., Tu, Z., He, K.: Aggregated residual transformations for deep neural networks. In: Proceedings of the 2017 IEEE Conference on computer vision and pattern recognition (CVPR). pp. 1492–1500 (2017). https://doi.org/10.48550/arXiv.1611.05431
Acknowledgements
This work was supported by the Key R&D Program of Jiangxi Province of China (Grant No. 20181BBG70031) and the National Natural Science Foundation of China (Grant No. 62066027).
Author information
Authors and Affiliations
Contributions
Zhentao Zhang, Wenhao Li and Yuxi Cheng contributed equally to this manuscript. Qingnan Huang and Taorong Qiu provided helpful guidance for this manuscript. Correspondence should be addressed to Taorong Qiu: qiutaorong@ncu.edu.cn.
Corresponding author
Ethics declarations
Conflicts of interest
The process of writing and the content of the paper do not give grounds for raising the issue of a conflict of interest.
Ethical approval
This paper is a completely original work of its authors; it has not been published before and will not be sent to other publications until the PRIA editorial board decides not to accept it for publication.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhang, Z., Li, W., Cheng, Y. et al. An improved residual learning model and its application to hardware image classification. Vis Comput (2024). https://doi.org/10.1007/s00371-024-03340-3