Abstract
With the advent of deep learning, portable musical instrument recognition, i.e., instrument recognition performed on mobile devices, has progressed substantially. Instruments have historically been classified through manual labeling, which is time-consuming, labor-intensive, and error-prone. Recent research has instead concentrated on automating classification by extracting features from the music itself. Because of the complicated interplay between the fundamental and its harmonics, however, identifying the informative audio features remains difficult. This article describes in detail the underlying ideas and the implementation of portable musical instrument identification based on acoustic characteristics. It extracts acoustic features from music sources using the Short-Time Fourier Transform (STFT) and classifies them with a Learning Vector Quantization (LVQ) neural network. A feature selection strategy then picks the most informative features, which lowers the dimensionality of the classifier’s feature vector and improves training and recognition efficiency. In the experiments, the weighted recognition accuracy is 79.8% when all features are used. Reducing the feature vector to 24 dimensions yields the highest weighted recognition rate, 81.2%, a reported improvement of 1.3% over the full feature set. Feature dimensionality reduction can therefore increase recognition performance. Reducing the feature vector below 24 dimensions lowered the overall accuracy, however, which indicates that an optimal feature dimensionality exists and that it differs by instrument category: 24 dimensions give the best results for piano recognition, whereas 20 dimensions give the highest accuracy for cello recognition. These findings highlight the significance of feature selection in obtaining high accuracy for particular instrument types.
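The pipeline outlined in the abstract (STFT-based features, feature selection, LVQ classification) can be illustrated with a small sketch. This is not the paper's implementation: the synthetic two-class tones, the band-pooled log-magnitude features, the Fisher-score ranking, the LVQ1 update rule with one prototype per class, and every function name and parameter value below are assumptions chosen only to make the example self-contained.

```python
import numpy as np

def synth_tone(f0, harmonic_amps, rng, sr=8000, dur=0.25):
    """Hypothetical stand-in for an instrument note: a sum of harmonics plus noise."""
    t = np.arange(int(sr * dur)) / sr
    tone = sum(a * np.sin(2 * np.pi * f0 * (h + 1) * t)
               for h, a in enumerate(harmonic_amps))
    return tone + 0.01 * rng.standard_normal(t.size)

def stft_features(signal, frame_len=256, hop=128, n_bands=24):
    """Time-averaged STFT magnitude, pooled into n_bands bands and log-compressed."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    mag = np.abs(np.fft.rfft(frames, axis=1))   # shape: (n_frames, frame_len // 2 + 1)
    bands = np.array_split(mag.mean(axis=0), n_bands)
    return np.log1p(np.array([b.mean() for b in bands]))

def fisher_scores(X, y):
    """Between-class over within-class variance per feature (assumed selection criterion)."""
    overall = X.mean(axis=0)
    between = sum((y == c).sum() * (X[y == c].mean(axis=0) - overall) ** 2
                  for c in np.unique(y))
    within = sum(((X[y == c] - X[y == c].mean(axis=0)) ** 2).sum(axis=0)
                 for c in np.unique(y))
    return between / (within + 1e-12)

def train_lvq1(X, y, n_epochs=30, lr=0.1, seed=0):
    """LVQ1 with one prototype per class, initialised at the class mean."""
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    protos = np.stack([X[y == c].mean(axis=0) for c in classes])
    for epoch in range(n_epochs):
        rate = lr * (1 - epoch / n_epochs)       # linearly decaying learning rate
        for i in rng.permutation(len(X)):
            k = np.argmin(np.linalg.norm(protos - X[i], axis=1))
            # Attract the winning prototype if its label matches, repel it otherwise.
            sign = 1.0 if classes[k] == y[i] else -1.0
            protos[k] += sign * rate * (X[i] - protos[k])
    return classes, protos

def predict_lvq(X, classes, protos):
    """Assign each sample the label of its nearest prototype."""
    dists = np.linalg.norm(X[:, None, :] - protos[None, :, :], axis=2)
    return classes[np.argmin(dists, axis=1)]

# Two made-up timbres: one dominated by the fundamental, one with strong upper harmonics.
rng = np.random.default_rng(0)
profiles = {0: [1.0, 0.3], 1: [0.3, 0.3, 0.6, 1.0, 1.0, 0.8]}
X, y = [], []
for label, amps in profiles.items():
    for _ in range(40):
        X.append(stft_features(synth_tone(rng.uniform(220, 262), amps, rng)))
        y.append(label)
X, y = np.array(X), np.array(y)

# Hold out a quarter of the notes, rank features on the training split only.
idx = rng.permutation(len(X))
train, test = idx[:60], idx[60:]
keep = np.argsort(fisher_scores(X[train], y[train]))[::-1][:8]

classes, protos = train_lvq1(X[train][:, keep], y[train])
accuracy = (predict_lvq(X[test][:, keep], classes, protos) == y[test]).mean()
print(f"held-out accuracy with {len(keep)} of {X.shape[1]} features: {accuracy:.2f}")
```

Varying the number of retained features (8 here) and re-measuring held-out accuracy is the kind of sweep that would reveal a best dimensionality per class, as the abstract reports for piano and cello; the numbers produced by this toy data say nothing about the paper's results.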
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11036-023-02174-y/MediaObjects/11036_2023_2174_Fig1_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11036-023-02174-y/MediaObjects/11036_2023_2174_Fig2_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11036-023-02174-y/MediaObjects/11036_2023_2174_Fig3_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11036-023-02174-y/MediaObjects/11036_2023_2174_Fig4_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11036-023-02174-y/MediaObjects/11036_2023_2174_Fig5_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11036-023-02174-y/MediaObjects/11036_2023_2174_Fig6_HTML.png)
Data availability
The labeled dataset used to support the findings of this study is available from the corresponding author upon request.
Ethics declarations
Conflict of interest
The author declares no competing interests.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Sun, Y. Timbre-Based Portable Musical Instrument Recognition Using LVQ Learning Algorithm. Mobile Netw Appl (2023). https://doi.org/10.1007/s11036-023-02174-y
DOI: https://doi.org/10.1007/s11036-023-02174-y