Efficient parallel kernel based on Cholesky decomposition to accelerate multichannel nonnegative matrix factorization

The Journal of Supercomputing

Abstract

Multichannel source separation has been a popular topic, and recently proposed methods based on the local Gaussian model have provided promising results despite their high computational cost when several sensors are used. This drawback limits the practical application of the approach to tasks such as sound field reconstruction or virtual reality. In this study, we present a numerical approach that reduces the complexity of multichannel nonnegative matrix factorization (MNMF) for audio source separation in scenarios with a large number of sensors, such as high-order ambisonics encoding. In particular, we propose a parallel driver to compute the multiplicative update rules in MNMF approaches. It is designed to run on sequential and multicore computers, as well as on graphics processing units (GPUs). The solution reduces the computational cost of the multiplicative update rules by using the Cholesky decomposition and solving several triangular systems of equations. The driver was evaluated in different scenarios, with promising execution times on both CPU and GPU. To the best of our knowledge, this proposal is the first to address the problem of reducing the computational cost of full-rank MNMF-based systems using parallel and high-performance techniques.
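The general idea of replacing an explicit matrix inverse with a Cholesky factorization followed by two triangular solves can be illustrated with a minimal NumPy sketch. This is not the authors' kernel, and the abstract does not state the actual MNMF update rules. The function names, the matrix size `M`, and the random test matrices are all assumptions made for illustration. It only shows how a system A X = B with a Hermitian positive definite A (as a spatial covariance matrix would be) might be solved without ever forming the inverse of A.

```python
import numpy as np


def forward_substitution(L, B):
    """Solve L Y = B for Y, with L lower triangular."""
    n = L.shape[0]
    Y = np.zeros_like(B, dtype=complex)
    for i in range(n):
        # Subtract the contribution of the rows already solved.
        Y[i] = (B[i] - L[i, :i] @ Y[:i]) / L[i, i]
    return Y


def back_substitution(U, Y):
    """Solve U X = Y for X, with U upper triangular."""
    n = U.shape[0]
    X = np.zeros_like(Y, dtype=complex)
    for i in range(n - 1, -1, -1):
        X[i] = (Y[i] - U[i, i + 1:] @ X[i + 1:]) / U[i, i]
    return X


def cholesky_solve(A, B):
    """Solve A X = B for Hermitian positive definite A without forming A^{-1}."""
    L = np.linalg.cholesky(A)                 # A = L L^H, L lower triangular
    Y = forward_substitution(L, B)            # first triangular system: L Y = B
    return back_substitution(L.conj().T, Y)   # second triangular system: L^H X = Y


def random_hpd(m, rng):
    """Hypothetical helper: random Hermitian positive definite m x m matrix."""
    G = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
    return G @ G.conj().T + m * np.eye(m)


rng = np.random.default_rng(0)
M = 16                      # illustrative number of channels (assumption)
A = random_hpd(M, rng)      # stands in for a covariance-like matrix
B = random_hpd(M, rng)      # stands in for a right-hand side

X = cholesky_solve(A, B)
residual = np.linalg.norm(A @ X - B) / np.linalg.norm(B)
print(f"relative residual: {residual:.2e}")
```

In a full MNMF iteration, systems of this kind would presumably be solved independently for many time-frequency points, which is what makes the computation a natural candidate for multicore and GPU parallelism. The sketch above handles a single system only.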

Data Availability

The data generated in this study are available from the corresponding author upon request.

Notes

  1. http://ambisonics.ch/

  2. https://github.com/QHPC-SP-Research-Lab/Parallel-Cholesky-MNMF

  3. https://ark.intel.com/

  4. https://www.nvidia.com/en-us/data-center/tesla-p100/

  5. https://www.nvidia.com/en-us/design-visualization/rtx-a6000/

Funding

This work was supported by MCIN/AEI/10.13039/501100011033 under project grant PID2020-119082RB-{C21,C22}, and Gobierno del Principado de Asturias under grant AYUD/2021/50994.

Author information

Contributions

All the authors contributed equally to this study.

Corresponding author

Correspondence to Antonio J. Muñoz-Montoro.

Ethics declarations

Conflict of interest

The authors declare no conflicts of interest.

Ethical Approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Muñoz-Montoro, A.J., Carabias-Orti, J.J., Salvati, D. et al. Efficient parallel kernel based on Cholesky decomposition to accelerate multichannel nonnegative matrix factorization. J Supercomput 79, 20649–20664 (2023). https://doi.org/10.1007/s11227-023-05471-1

  • DOI: https://doi.org/10.1007/s11227-023-05471-1
