Mismatching-aware unsupervised translation quality estimation for low-resource languages

Azadi, Fatemeh; Faili, Heshaam; Dousti, Mohammad Javad

doi:10.1007/s10579-024-09727-x

Original Paper
Published: 05 May 2024

Language Resources and Evaluation (2024)

Abstract

Translation Quality Estimation (QE) is the task of predicting the quality of machine translation (MT) output without any reference translation. The task has gained increasing attention as an important component of practical MT applications. In this paper, we first propose XLMRScore, a cross-lingual counterpart of BERTScore computed via the XLM-RoBERTa (XLMR) model. This metric can be used as a simple unsupervised QE method, but it faces two issues: first, untranslated tokens lead to unexpectedly high translation scores; second, the greedy matching in XLMRScore produces mismatching errors between source and hypothesis tokens. To mitigate the first issue, we suggest replacing untranslated words with the unknown token; to mitigate the second, we suggest cross-lingual alignment of the pre-trained model so that aligned words are represented closer to each other. We evaluate the proposed method on four low-resource language pairs of the WMT21 QE shared task, as well as on a new English→Persian (En-Fa) test dataset introduced in this paper. Experiments show that our method achieves results comparable to the supervised baseline in two zero-shot scenarios, i.e., with less than 0.01 difference in Pearson correlation, while outperforming unsupervised rivals on all the low-resource language pairs by more than 8% on average.
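The greedy matching behind a BERTScore-style metric, and the replacement of untranslated words, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`greedy_match_score`, `mask_untranslated`), the toy two-dimensional vectors standing in for XLMR token embeddings, the `<unk>` string, and the rule that a hypothesis token counts as untranslated when it is identical to a source token are all assumptions made for illustration. The abstract does not state the exact criterion used in the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def greedy_match_score(src_vecs, hyp_vecs):
    """BERTScore-style greedy matching with the source in place of a reference.

    Each source token is matched to its most similar hypothesis token
    (recall) and each hypothesis token to its most similar source token
    (precision); F1 is their harmonic mean.
    """
    sim = [[cosine(s, h) for h in hyp_vecs] for s in src_vecs]
    recall = sum(max(row) for row in sim) / len(src_vecs)
    precision = sum(max(col) for col in zip(*sim)) / len(hyp_vecs)
    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f1


def mask_untranslated(src_tokens, hyp_tokens, unk="<unk>"):
    """Replace hypothesis tokens copied verbatim from the source (assumed rule)."""
    src_set = set(src_tokens)
    return [unk if tok in src_set else tok for tok in hyp_tokens]


if __name__ == "__main__":
    # Toy vectors; in practice these would be contextual token embeddings.
    source_vectors = [[1.0, 0.0], [0.0, 1.0]]
    hypothesis_vectors = [[1.0, 0.0]]
    print(greedy_match_score(source_vectors, hypothesis_vectors))

    # A copied source word is masked before scoring.
    print(mask_untranslated(["the", "cat"], ["the", "گربه"]))
```

Because a copied token is trivially most similar to itself, leaving it in the hypothesis inflates the greedy-matching score; masking it first removes that shortcut. The identical-token rule used here would also mask legitimately copied items such as numbers or names, which a real system would need to handle.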

Acknowledgements

We acknowledge partial support from the Institute for Research in Fundamental Sciences (IPM) under grant number CS1403-04-192, and from the Iran National Science Foundation (INSF) under grant number 4002438.

Author information

Authors and Affiliations

School of Electrical and Computer Engineering, College of Engineering, University of Tehran, Tehran, Iran
Fatemeh Azadi, Heshaam Faili & Mohammad Javad Dousti
School of Computer Science, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
Heshaam Faili

Corresponding author

Correspondence to Heshaam Faili.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Azadi, F., Faili, H. & Dousti, M.J. Mismatching-aware unsupervised translation quality estimation for low-resource languages. Lang Resources & Evaluation (2024). https://doi.org/10.1007/s10579-024-09727-x

Accepted: 30 January 2024
Published: 05 May 2024
DOI: https://doi.org/10.1007/s10579-024-09727-x
