Explicit Length Modelling for Statistical Machine Translation

Silvestre-Cerdà, Joan Albert; Andrés-Ferrer, Jesús; Civera, Jorge

doi:10.1007/978-3-642-21257-4_34

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6669)

Included in the following conference series:

Iberian Conference on Pattern Recognition and Image Analysis

Abstract

Explicit length modelling has been previously explored in statistical pattern recognition with successful results. In this paper, two length models along with two parameter estimation methods for statistical machine translation (SMT) are presented. More precisely, we incorporate explicit length modelling in a state-of-the-art log-linear SMT system as an additional feature function in order to prove the contribution of length information. Finally, promising experimental results are reported on a reference SMT task.
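As an illustration of how a length model can enter a log-linear system as one more feature function, here is a minimal sketch. It is not the authors' implementation: the abstract does not specify the two length models or their estimation methods, so the Poisson length model, the `ratio` parameter, and all function and feature names below are assumptions made for this example only.

```python
import math


def poisson_length_logprob(target_len, source_len, ratio=1.0):
    """Hypothetical length model: log p(target_len | source_len).

    Assumes the target length is Poisson-distributed with mean
    ratio * source_len. Requires source_len > 0 and ratio > 0.
    """
    mean = ratio * source_len
    return target_len * math.log(mean) - mean - math.lgamma(target_len + 1)


def loglinear_score(features, weights):
    """Weighted sum of feature values, i.e. the log-linear model score."""
    return sum(weights[name] * value for name, value in features.items())


def best_hypothesis(source_len, hypotheses, weights):
    """Pick the hypothesis with the highest log-linear score.

    Each hypothesis is a dict with "tokens" (list of words) and "features"
    (dict of log-domain feature values from the other models). The length
    feature is computed here and added to the existing features.
    """
    scored = []
    for hyp in hypotheses:
        feats = dict(hyp["features"])
        feats["length"] = poisson_length_logprob(len(hyp["tokens"]), source_len)
        scored.append((loglinear_score(feats, weights), hyp))
    return max(scored, key=lambda pair: pair[0])[1]


if __name__ == "__main__":
    hyps = [
        {"tokens": ["a"] * 5, "features": {"tm": -2.0}},
        {"tokens": ["a"] * 12, "features": {"tm": -1.5}},
    ]
    weights = {"tm": 1.0, "length": 1.0}
    print(len(best_hypothesis(5, hyps, weights)["tokens"]))
```

In this sketch the length feature is weighted like any other feature, so setting its weight to zero recovers the baseline scoring. In a real system the weights would be tuned on held-out data rather than fixed by hand.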

Work supported by the EC (FEDER/FSE) and the Spanish MEC/MICINN under the MIPRCV “Consolider Ingenio 2010” program (CSD2007-00018) and iTrans2 (TIN2009-14511) projects. Also supported by the Spanish MITyC under the erudito.com (TSI-020110-2009-439) project and by the Generalitat Valenciana under grant Prometeo/2009/014 and GV/2010/067, and by the “Vicerrectorado de Investigación de la UPV” under grant 20091027.

Author information

Authors and Affiliations

Departament de Sistemes Informàtics i Computació, Universitat Politècnica de València, Spain
Joan Albert Silvestre-Cerdà, Jesús Andrés-Ferrer & Jorge Civera

Editor information

Editors and Affiliations

Departament de Matemàtica Aplicada i Anàlisi, Universitat de Barcelona, Facultat de Matemàtiques, Gran Via de les Corts Catalanes 585, 08007, Barcelona, Spain
Jordi Vitrià
Instituto de Sistemas e Robótica / Instituto Superior Técnico, Av. Rovisco Pais, 1, 1049-001, Lisbon, Portugal
João Miguel Sanches
Institute for Intelligent Systems and Numerical Applications in Engineering (SIANI), Edificio de Informática y Matemáticas, University of Las Palmas de Gran Canaria, Campus Universitario de Tafira, 35017, Las Palmas, Spain
Mario Hernández

About this paper

Cite this paper

Silvestre-Cerdà, J.A., Andrés-Ferrer, J., Civera, J. (2011). Explicit Length Modelling for Statistical Machine Translation. In: Vitrià, J., Sanches, J.M., Hernández, M. (eds) Pattern Recognition and Image Analysis. IbPRIA 2011. Lecture Notes in Computer Science, vol 6669. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21257-4_34

DOI: https://doi.org/10.1007/978-3-642-21257-4_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21256-7
Online ISBN: 978-3-642-21257-4
eBook Packages: Computer Science, Computer Science (R0)
