Acoustic-Phonetic Decoding of Speech

Schwartz, Richard M.; Chow, Y.; Dunham, M.; Kimball, O.; Krasner, M.; Kubala, F.; Makhoul, J.; Price, P.; Roucos, S.

doi:10.1007/978-3-642-83476-9_2

Part of the book series: NATO ASI Series (NATO ASI F, volume 46)

Abstract

Several methods for acoustic-phonetic decoding are reviewed. Emphasis is placed on the need for mathematical methods for speech recognition. Several examples of statistical methods are described. The author presents several techniques for incorporating “speech knowledge” into these statistical models, and provides a simple formalism for using multiple knowledge sources in a coherent speech recognition system.

While this paper was organized, written, and presented by the first author, several paragraphs of this paper are taken directly from several conference papers written by various combinations of all the authors.

Author information

Authors and Affiliations

BBN Laboratories Inc., 10 Moulton Street, 02238, Cambridge, MA, USA
Richard M. Schwartz, Y. Chow, M. Dunham, O. Kimball, M. Krasner, F. Kubala, J. Makhoul, P. Price & S. Roucos

Editor information

Editors and Affiliations

Universität Erlangen-Nürnberg, Martensstr. 3, D-8520, Erlangen, Germany
H. Niemann & G. Sagerer
ZT ZTI SYS 5, Siemens AG, Otto-Hahn-Ring 6, D-8000, München 83, Germany
M. Lang

About this paper

Cite this paper

Schwartz, R.M. et al. (1988). Acoustic-Phonetic Decoding of Speech. In: Niemann, H., Lang, M., Sagerer, G. (eds) Recent Advances in Speech Understanding and Dialog Systems. NATO ASI Series, vol 46. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-83476-9_2

DOI: https://doi.org/10.1007/978-3-642-83476-9_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-83478-3
Online ISBN: 978-3-642-83476-9
eBook Packages: Springer Book Archive