Constrained and Unconstrained Audio Classification

Prathima, T.; Govardhan, A.; Palla, Sreeja; Sri Yagna, K.

doi:10.1007/978-981-16-7330-6_75

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1415)

Abstract

Advances in technology have made large amounts of audio data available. Audio plays a decisive role in tasks such as activity recognition and event detection, and classifying an audio stream helps corroborate results obtained from other media. We trained a CNN model to classify the benchmark data sets ESC-10 and ESC-50, and also evaluated it on a custom data set. The CNN is trained on low-level audio features extracted from custom and benchmark audio snippets recorded in both constrained and noisy environments. We identified a CNN architecture with a minimal number of layers that works well on both the benchmark and custom data sets. We also ran experiments to find the most influential feature, one that alone is sufficient to classify multiple classes of audio data. Classification accuracy as high as 98% is reported.
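
The idea of testing whether a single low-level feature can separate classes on its own can be illustrated with a small sketch. This is not the chapter's pipeline: the abstract names neither the features nor the CNN layout, so everything below is an assumption made for illustration. The clips are synthetic sine tones rather than ESC-10/ESC-50 recordings, the two features (zero-crossing rate and RMS energy) are common low-level descriptors chosen here as examples, and a nearest-centroid rule stands in for the CNN. All names (`make_clip`, `single_feature_accuracy`, `SAMPLE_RATE`, and so on) are hypothetical.

```python
import math
import random

SAMPLE_RATE = 8000   # Hz; hypothetical value for the synthetic clips
CLIP_LEN = 2000      # samples per clip


def make_clip(freq_hz, rng, noise_std=0.02):
    """Synthetic stand-in for an audio snippet: a sine tone plus Gaussian noise."""
    phase = rng.uniform(0.0, 2.0 * math.pi)
    return [
        0.5 * math.sin(2.0 * math.pi * freq_hz * n / SAMPLE_RATE + phase)
        + rng.gauss(0.0, noise_std)
        for n in range(CLIP_LEN)
    ]


def zero_crossing_rate(x):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(x, x[1:]) if a * b < 0)
    return crossings / (len(x) - 1)


def rms_energy(x):
    """Root-mean-square amplitude of the clip."""
    return math.sqrt(sum(v * v for v in x) / len(x))


# Candidate features, each evaluated in isolation.
FEATURES = {"zcr": zero_crossing_rate, "rms": rms_energy}


def single_feature_accuracy(train, test, feature_fn):
    """Nearest-centroid accuracy using one scalar feature (a stand-in for the CNN)."""
    sums, counts = {}, {}
    for clip, label in train:
        sums[label] = sums.get(label, 0.0) + feature_fn(clip)
        counts[label] = counts.get(label, 0) + 1
    centroids = {label: sums[label] / counts[label] for label in sums}

    correct = 0
    for clip, label in test:
        value = feature_fn(clip)
        predicted = min(centroids, key=lambda c: abs(centroids[c] - value))
        correct += predicted == label
    return correct / len(test)


def build_dataset(rng, clips_per_class):
    """Two hypothetical classes that differ in pitch but not in loudness."""
    classes = {"low_tone": 200.0, "high_tone": 1500.0}
    return [
        (make_clip(freq, rng), label)
        for label, freq in classes.items()
        for _ in range(clips_per_class)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
train_set = build_dataset(rng, 20)
test_set = build_dataset(rng, 20)

# Score every feature on its own and pick the most discriminative one.
scores = {
    name: single_feature_accuracy(train_set, test_set, fn)
    for name, fn in FEATURES.items()
}
best_feature = max(scores, key=scores.get)
print(scores, best_feature)
```

Because the two synthetic classes share the same amplitude, RMS energy carries almost no class information here, while zero-crossing rate tracks pitch and separates them. On real recordings the ranking of features could be entirely different; the sketch only shows the shape of a one-feature-at-a-time comparison.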

Author information

Authors and Affiliations

Chaitanya Bharathi Institute of Technology, Hyderabad, Telangana, India
T. Prathima, Sreeja Palla & K. Sri Yagna
Department of CSE, JNTUH, Hyderabad, Telangana, India
A. Govardhan

Corresponding author

Correspondence to T. Prathima.

Editor information

Editors and Affiliations

Department of Computer Science Engineering, Vaigai College of Engineering, Madurai, Tamil Nadu, India
A. Pasumpon Pandian
Department of Business Administration, The Gerald Schwartz School of Business, Nova Scotia, NS, Canada
Ram Palanisamy
Department of Computer Science Engineering, Malla Reddy College of Engineering, Secunderabad, Telangana, India
M. Narayanan
University of the Ryukyus, Okinawa, Japan
Tomonobu Senjyu

Cite this paper

Prathima, T., Govardhan, A., Palla, S., Sri Yagna, K. (2022). Constrained and Unconstrained Audio Classification. In: Pandian, A.P., Palanisamy, R., Narayanan, M., Senjyu, T. (eds) Proceedings of Third International Conference on Intelligent Computing, Information and Control Systems. Advances in Intelligent Systems and Computing, vol 1415. Springer, Singapore. https://doi.org/10.1007/978-981-16-7330-6_75

DOI: https://doi.org/10.1007/978-981-16-7330-6_75
Published: 15 March 2022
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-7329-0
Online ISBN: 978-981-16-7330-6
eBook Packages: Intelligent Technologies and Robotics (R0)
