Cosine Distance Metric Learning for Speaker Verification Using Large Margin Nearest Neighbor Method

  • Conference paper
Advances in Multimedia Information Processing – PCM 2014 (PCM 2014)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 8879))

Abstract

In this paper, a novel cosine similarity metric learning method based on large margin nearest neighbor (LMNN) is proposed for an i-vector based speaker verification system. In such a system, the decision is generally based on the cosine distance between the test i-vector and the target i-vector. Metric learning methods are employed to reduce the within-class variation and maximize the between-class variation. In the proposed method, a cosine similarity large margin nearest neighbor (CSLMNN) metric is learned from the development data, and the test and target i-vectors are linearly transformed using the learned metric. The objective of learning the metric is to ensure that the k-nearest neighbors that belong to the same speaker are clustered together, while impostors are moved away by a large margin. Experiments conducted on the NIST-2008 and YOHO databases show improved performance compared with a speaker verification system in which no learned metric is used.
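The scoring step and the learning objective described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names `cosine_score` and `lmnn_style_cosine_loss`, the parameters `k`, `margin`, and `c`, the choice of target neighbors by cosine similarity in the original space, and the exact form of the pull and push terms (a cosine analogue of the usual LMNN hinge loss). The paper's actual CSLMNN formulation may differ.

```python
import numpy as np


def cosine_score(w_target, w_test, A=None):
    """Cosine similarity between two i-vectors.

    If a matrix A is given (hypothetical learned linear transform),
    both vectors are mapped through A before scoring.
    """
    if A is not None:
        w_target, w_test = A @ w_target, A @ w_test
    denom = np.linalg.norm(w_target) * np.linalg.norm(w_test)
    return float(w_target @ w_test / denom)


def lmnn_style_cosine_loss(X, y, A, k=1, margin=0.1, c=1.0):
    """LMNN-style objective written with cosine similarity (illustrative only).

    X: (n, d) development i-vectors, y: (n,) speaker labels,
    A: (p, d) linear transform being evaluated.
    Pull term: target neighbors (same speaker) should have similarity near 1.
    Push term: impostors (other speakers) should score lower than each
    target neighbor by at least `margin`.
    """
    y = np.asarray(y)
    # Cosine similarities after the transform.
    Z = X @ A.T
    Z = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    S = Z @ Z.T
    # Target neighbors are fixed using similarities in the original space
    # (an assumption borrowed from standard LMNN practice).
    X0 = X / np.linalg.norm(X, axis=1, keepdims=True)
    S0 = X0 @ X0.T

    loss = 0.0
    n = len(y)
    for i in range(n):
        same = [j for j in range(n) if j != i and y[j] == y[i]]
        impostors = [l for l in range(n) if y[l] != y[i]]
        targets = sorted(same, key=lambda j: -S0[i, j])[:k]
        for j in targets:
            loss += 1.0 - S[i, j]  # pull same-speaker neighbors together
            for l in impostors:
                # hinge penalty when an impostor is within the margin
                loss += c * max(0.0, margin + S[i, l] - S[i, j])
    return loss
```

In a verification trial under this sketch, a claim would be accepted when `cosine_score(w_target, w_test, A)` meets or exceeds a decision threshold tuned on development data; passing `A=None` gives the plain cosine baseline with no learned metric.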




Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Ahmad, W., Karnick, H., Hegde, R.M. (2014). Cosine Distance Metric Learning for Speaker Verification Using Large Margin Nearest Neighbor Method. In: Ooi, W.T., Snoek, C.G.M., Tan, H.K., Ho, CK., Huet, B., Ngo, CW. (eds) Advances in Multimedia Information Processing – PCM 2014. PCM 2014. Lecture Notes in Computer Science, vol 8879. Springer, Cham. https://doi.org/10.1007/978-3-319-13168-9_33


  • DOI: https://doi.org/10.1007/978-3-319-13168-9_33

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-13167-2

  • Online ISBN: 978-3-319-13168-9

  • eBook Packages: Computer Science, Computer Science (R0)
