A gated multi-hierarchical feature fusion network for recognizing steel plate surface defects

  • Regular Paper
  • Multimedia Systems

Abstract

Recognizing defects on steel plate surfaces has great application potential in steel manufacturing. However, accurate recognition remains challenging because most defects occupy only a small area of the image and closely resemble the surrounding background. To address these issues, we propose an attention multi-hierarchical feature fusion network (AMHNet) to recognize defects. First, to better fuse features from different levels, we propose a skipping attention module that selectively transfers informative features from low-level layers to high-level layers based on the convolutional block attention mechanism. Second, to dynamically fuse multi-hierarchical features, we propose a feature dynamic aggregation gate that uses a gating mechanism to enhance defect-relevant features and suppress useless ones. Finally, to verify the effectiveness and advantages of our model, we collect a new challenging defect recognition dataset called NPU-DRD. Extensive experiments on NPU-DRD show that AMHNet achieves an accuracy of 97.58% and an AUC score of 97.23%, which are new state-of-the-art results among existing methods. Our new dataset and source code are available at https://github.com/Heisenberg828/AMHNet.
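As a rough illustration of the two ideas named in the abstract, the sketch below shows a per-channel attention step followed by a sigmoid gate that blends low-level and high-level features. It is not the authors' implementation: the function names, the scalar per-channel weights, and the use of global average pooling alone are assumptions made for brevity, whereas the convolutional block attention mechanism itself also uses max pooling, a shared MLP, and spatial attention. The reference code is in the repository linked above.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feature, weights, bias):
    """Rescale each channel by a sigmoid score of its global average.

    feature: list of channels, each a flat list of spatial activations.
    weights, bias: one hypothetical learnable scalar per channel.
    """
    pooled = [sum(ch) / len(ch) for ch in feature]  # global average pooling
    scores = [sigmoid(w * p + b) for w, p, b in zip(weights, pooled, bias)]
    return [[s * v for v in ch] for s, ch in zip(scores, feature)]


def gated_fusion(low, high, w_low, w_high, bias):
    """Blend two feature maps channel by channel with a gate g in (0, 1).

    fused = g * low + (1 - g) * high, where g depends on both inputs.
    """
    fused = []
    for ch_low, ch_high, wl, wh, b in zip(low, high, w_low, w_high, bias):
        mean_low = sum(ch_low) / len(ch_low)
        mean_high = sum(ch_high) / len(ch_high)
        g = sigmoid(wl * mean_low + wh * mean_high + b)
        fused.append([g * a + (1.0 - g) * c for a, c in zip(ch_low, ch_high)])
    return fused


if __name__ == "__main__":
    # Two channels with two spatial positions each (toy values).
    low = [[1.0, 3.0], [2.0, 2.0]]
    high = [[3.0, 5.0], [0.0, 4.0]]
    attended = channel_attention(low, [0.0, 0.0], [0.0, 0.0])
    print(gated_fusion(attended, high, [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]))
```

With all gate parameters at zero the gate equals 0.5 and the fusion reduces to a plain average; a large positive bias pushes the gate toward 1 so the low-level branch dominates, which is the sense in which a gate can enhance some features and suppress others.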

Data availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author upon reasonable request.

Funding

This work was supported by the National Natural Science Foundation of China (No. 62102320), the Key Research and Development Program of Shaanxi Province (No. 2023-ZDLGY-53), and the Fundamental Research Funds for the Central Universities (No. D5000210737).

Author information

Contributions

Huanjie Tao and Minghao Lu wrote the main manuscript text. Minghao Lu and Zhenwu Hu prepared all the figures and tables. All authors reviewed the manuscript.

Corresponding author

Correspondence to Huanjie Tao.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Tao, H., Lu, M., Hu, Z. et al. A gated multi-hierarchical feature fusion network for recognizing steel plate surface defects. Multimedia Systems 29, 1347–1360 (2023). https://doi.org/10.1007/s00530-023-01066-1

  • DOI: https://doi.org/10.1007/s00530-023-01066-1
