Multi-branch Network with Cross-Domain Feature Fusion for Anomalous Sound Detection

Fang, Wenjie; Fan, Xin; Hu, Ying

doi:10.1007/978-981-97-0601-3_18

Part of the book series: Communications in Computer and Information Science (CCIS, volume 2006)

Included in the following conference series:

National Conference on Man-Machine Speech Communication


Abstract

Anomalous sound detection (ASD) is a key technology for identifying abnormal sounds in various industries. Self-supervised ASD aims to detect unknown anomalous machine sounds by learning the characteristics of normal sounds using metainformation. In this paper, we propose a multi-branch network with cross-domain feature fusion (MBN-CFF) for the self-supervised ASD task. The multi-branch network splits the complete feature representations and feeds them individually into classifiers to generate category predictions. The weighted loss, calculated from the multiple predictions and the real labels, guides the model training process. We also design a cross-domain feature fusion (CFF) block for effectively fusing the time-domain and frequency-domain features and an attentive sandglass (AS) block for effectively extracting features. Experimental results on DCASE2020 Challenge Task 2 show that our MBN-CFF network achieves the best performance, with an AUC score of 94.73% and a pAUC score of 88.74%, compared with five other existing methods for anomalous sound detection. The results of ablation experiments show the effectiveness of the CFF and AS blocks and of multi-branch prediction (MBP).
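The multi-branch prediction and weighted loss described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the features are split, what the classifiers look like, or how the branch weights are chosen, so everything here is an assumption made for illustration: equal contiguous chunks, one linear softmax classifier per chunk, cross-entropy per branch, and hypothetical names such as split_features and weighted_multi_branch_loss.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def split_features(features, num_branches):
    """Split one feature vector into equal contiguous chunks (assumed scheme)."""
    if len(features) % num_branches != 0:
        raise ValueError("feature length must be divisible by num_branches")
    size = len(features) // num_branches
    return [features[i * size:(i + 1) * size] for i in range(num_branches)]

def branch_probs(chunk, weights, bias):
    """Linear classifier on one chunk; weights holds one row per category."""
    logits = [sum(w * x for w, x in zip(row, chunk)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)

def weighted_multi_branch_loss(features, label, classifiers, branch_weights):
    """Return (weighted sum of per-branch cross-entropy losses, per-branch losses)."""
    if len(classifiers) != len(branch_weights):
        raise ValueError("need exactly one weight per branch")
    chunks = split_features(features, len(classifiers))
    losses = []
    for chunk, (weights, bias) in zip(chunks, classifiers):
        probs = branch_probs(chunk, weights, bias)
        # Cross-entropy against the true category, clamped to avoid log(0).
        losses.append(-math.log(max(probs[label], 1e-12)))
    total = sum(w * l for w, l in zip(branch_weights, losses))
    return total, losses

# Toy usage: 6 features, 2 branches, 3 categories, untrained (all-zero) classifiers.
features = [0.5, -1.0, 2.0, 0.0, 1.5, -0.5]
classifiers = [
    ([[0.0] * 3 for _ in range(3)], [0.0] * 3),
    ([[0.0] * 3 for _ in range(3)], [0.0] * 3),
]
total, per_branch = weighted_multi_branch_loss(features, 1, classifiers, [0.7, 0.3])
print(total, per_branch)
```

In this sketch every branch is trained against the same category label, and the branch weights control how much each branch's prediction contributes to the training signal. The paper's actual branch design and weighting may differ.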

Acknowledgements

This work is supported by the Multi-lingual Information Technology Research Center of Xinjiang (ZDI145-21).

Author information

Authors and Affiliations

School of Computer Science and Technology, Xinjiang University, Urumqi, China
Wenjie Fang, Xin Fan & Ying Hu
Key Laboratory of Signal Detection and Processing in Xinjiang, Urumqi, China
Wenjie Fang, Xin Fan & Ying Hu

Authors

Wenjie Fang
Xin Fan
Ying Hu

Corresponding author

Correspondence to Ying Hu.

Editor information

Editors and Affiliations

Tsinghua University, Beijing, China
Jia Jia
University of Science and Technology of China, Anhui, China
Zhenhua Ling
Shanghai Jiao Tong University, Shanghai, China
Xie Chen
Beijing University of Posts and Telecommunications, Beijing, China
Ya Li
Hunan University, Hunan, China
Zixing Zhang

About this paper

Cite this paper

Fang, W., Fan, X., Hu, Y. (2024). Multi-branch Network with Cross-Domain Feature Fusion for Anomalous Sound Detection. In: Jia, J., Ling, Z., Chen, X., Li, Y., Zhang, Z. (eds) Man-Machine Speech Communication. NCMMSC 2023. Communications in Computer and Information Science, vol 2006. Springer, Singapore. https://doi.org/10.1007/978-981-97-0601-3_18

DOI: https://doi.org/10.1007/978-981-97-0601-3_18
Published: 15 February 2024
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-0600-6
Online ISBN: 978-981-97-0601-3
eBook Packages: Computer Science, Computer Science (R0)
