Translation of Idiomatic Expressions Across Different Languages: A Study of the Effectiveness of TransSearch

Where Humans Meet Machines

Abstract

This chapter presents a case study of how a user of TransSearch, a Web-based bilingual concordancer and translation spotter, can use the tool to find translations of idiomatic expressions. We show that, when queries to the system are formulated with care, TransSearch can effectively identify a fair number of idiomatic expressions and their translations. For indicative purposes, we compare the translations identified by our application to those returned by Google Translate, and we survey recent Computer-Assisted Translation tools whose functionalities are similar to those of TransSearch.


Notes

  1. At the time of writing, Google Translate produces the literal translation “Il ne pouvait pas dire boo à une oie.”

  2. Idioms (called locutions in French) are seen in phraseology as a subcategory of phrasemes; in the remainder of this chapter the term is used as a synonym for idiomatic expressions.

  3. Shown in small caps in the examples.

  4. http://www.trados.com.

  5. http://www.atril.com.

  6. http://terminotix.com.

  7. http://www.multicorpora.

  8. http://lucene.apache.org.

  9. In order to account for inflectional variations, we compared lemmatized translations.

  10. The 10 most frequent ones are: est à nos portes, arrive à grand pas, était imminent, nous attend, me guette, est sur le point, s’annonce, est en vue, sommes au bord de, and survenir.

  11. http://translate.google.com.

  12. http://www.linguee.com.

  13. http://www.tradooit.com.

  14. http://www.opensubtitles.org.

  15. http://www.termiumplus.gc.ca.

  16. http://www.tsrali3.com.

  17. http://www.terminotix.com.

Acknowledgements

This work was funded by an NSERC grant in collaboration with Terminotix (see Note 17). We are indebted to Sandy Dincky, Fabienne Venant, and Neil Stewart, who kindly participated in the annotation task.

Author information

Corresponding author

Correspondence to Stéphane Huet.

Copyright information

© 2013 Springer Science+Business Media New York

About this chapter

Cite this chapter

Huet, S., Langlais, P. (2013). Translation of Idiomatic Expressions Across Different Languages: A Study of the Effectiveness of TransSearch. In: Neustein, A., Markowitz, J. (eds) Where Humans Meet Machines. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-6934-6_9

  • DOI: https://doi.org/10.1007/978-1-4614-6934-6_9

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4614-6933-9

  • Online ISBN: 978-1-4614-6934-6

  • eBook Packages: Engineering, Engineering (R0)
