Abstract
New interactive music services have emerged, but many of them use proprietary file formats. To enable interoperability among these services, the International Organization for Standardization (ISO)/International Electrotechnical Commission (IEC) Moving Picture Experts Group (MPEG) issued a new standard, MPEG-A: Interactive Music Application Format (IM AF). This chapter reviews the IM AF standard and its features, and describes in detail the design and implementation of an IM AF codec and its integration into Sonic Visualiser, a popular open source tool for audio analysis, annotation and visualization. A discussion follows of the benefits of their combined features, such as automatic chord or melody extraction time-aligned with the song's lyrics. Furthermore, this integration provides the semantic music research community with a testbed for developing and comparing new Sonic Visualiser plug-ins, ranging, e.g., from singing voice-to-text conversion with automatic lyrics highlighting for karaoke applications to source separation-based music instrument extraction from a mixed song.
Abbreviations
- IM AF: interactive music application format
- ISO-BMFF: ISO base media file format
- ISRC: international standard recording code
- SV: Sonic Visualiser
References
P. Kudumakis: MP3: Something’s gotta change!, Audio! 1(3), 6 (2011)
I. Jang, P. Kudumakis, M. Sandler, K. Kang: The MPEG interactive music application format standard, IEEE Sig. Process. Mag. 28(1), 150–154 (2011)
iKlax Media: http://www.iklaxmusic.com (last accessed 12.01.14)
MOGG files: Multitrack Digital Audio Format, http://moggfiles.wordpress.com (last accessed 12.01.14)
MT9: http://en.wikipedia.org/wiki/MT9 (last accessed 12.01.14)
ISO/IEC 23000-12:2010 – Information technology – Multimedia application format (MPEG-A) – Part 12: Interactive music application format
ISO/IEC 23000-12:2010/Amd.2:2012 – Information technology – Multimedia application format (MPEG-A) – Part 12: Interactive music application format, AMENDMENT 2: Compact representation of dynamic volume change and audio equalization
J.C. Garcia, C. Taglialatela, P. Kudumakis, L.J. Tardon, I. Barbancho, M. Sandler: Interactive music applications by MPEG-A support in Sonic Visualiser. In: AES 53rd Int. Conf. Semant. Audio, London (2014)
C. Cannam, C. Landone, M. Sandler: Sonic Visualiser: An open source application for viewing, analysing, and annotating music audio files. In: Proc. ACM Multimedia Int. Conf. (2010)
M. Mauch, S. Dixon: Approximate note transcription for the improved identification of difficult chords. In: Proc. Int. Symp. Music Inf. Retriev. (2010) pp. 135–140
J. Salamon, E. Gómez: Melody extraction from polyphonic music signals using pitch contour characteristics, IEEE Trans. Audio Speech Lang. Proc. 20(6), 1759–1770 (2012)
ISO/IEC 23003-2:2010 – Information technology – MPEG audio technologies – Part 2: Spatial Audio Object Coding (SAOC)
ISO/IEC 10918-1:1994 – Information technology – Digital compression and coding of continuous-tone still images (JPEG)
ETS 3GPP TS 26.245-2004 – Transparent end-to-end Packet switched Streaming Service (PSS); Timed text format
ISO/IEC 15938-5:2003 – Information technology – Multimedia content description interface – Part 5: Multimedia description schemes
ISO/IEC 14496-12:2008 – Information technology – Coding of audio-visual objects – Part 12: ISO base media file format
C. Taglialatela: MPEG IM AF encoder: Features development, BSc Thesis (Seconda Università degli Studi di Napoli, Napoli 2013)
P. Kudumakis: MPEG developments https://code.soundsoftware.ac.uk/projects/mpegdevelopments (last accessed 12.01.14)
T. Hosoya, M. Suzuki, A. Ito, S. Makino: Lyrics recognition from a singing voice based on finite state automaton for music information retrieval. In: Proc. Int. Symp. Music Inf. Retriev. (2005) pp. 532–535
J. Han, Z. Rafii, B. Pardo: Audio source separation and REPEAT, Research projects of Northwestern University, Dep. of Elec. Eng. and Comp. Sc., http://music.cs.northwestern.edu (last accessed 12.01.14)
G. Herrero, P. Kudumakis, L.J. Tardon, I. Barbancho, M. Sandler: An HTML5 interactive (MPEG-A IM AF) music player. In: 10th Int. Symp. Comput. Music Multidiscip. Res. (CMMR), Marseille (2013)
Acknowledgements
Panos Kudumakis acknowledges that this work was partially done during his visit to the University of Malaga in the context of the program Andalucía TECH: Campus of International Excellence and in conjunction with the UK EPSRC project EP/H043101/1 SoundSoftware.ac.uk. This work has been partially funded by the Ministerio de Economía y Competitividad of the Spanish Government under Project No. TIN2016-75866-C3-2-R.
Copyright information
© 2018 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Corral García, J., Kudumakis, P., Barbancho, I., Tardón, L.J., Sandler, M. (2018). Enabling Interactive and Interoperable Semantic Music Applications. In: Bader, R. (ed.) Springer Handbook of Systematic Musicology. Springer Handbooks. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-55004-5_45
DOI: https://doi.org/10.1007/978-3-662-55004-5_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-55002-1
Online ISBN: 978-3-662-55004-5
eBook Packages: Engineering (R0)