
Research on the LSTM Mongolian and Chinese machine translation based on morpheme encoding

  • S.I.: Brain-Inspired Computing and Machine Learning for Brain Health
  • Published in: Neural Computing and Applications

Abstract

The neural machine translation model based on long short-term memory (LSTM), with its encoder–decoder structure and capacity for semantic mining, has become the mainstream approach in machine translation. However, few studies have applied LSTM to Mongolian–Chinese neural machine translation. This paper studies the preprocessing of a Mongolian–Chinese bilingual corpus and the construction of an LSTM model based on Mongolian morpheme encoding. In the corpus preprocessing stage, we present a hybrid algorithm for constructing the word segmentation module: unannotated sequences are semantically processed and labeled by a combination of a gated recurrent unit (GRU) and a conditional random field (CRF). In the model construction stage, to learn more grammatical and semantic knowledge from the Mongolian corpus, we present an LSTM neural network encoder based on morpheme encoding, together with an LSTM decoder that predicts the Chinese output. Experimental comparisons on sentences of different lengths show that the constructed model improves translation performance on long-distance dependency problems.
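To make the encoder–decoder structure described above concrete, below is a minimal sketch in PyTorch (an assumption; the paper does not specify a framework). The class names, embedding and hidden dimensions, and vocabulary sizes are illustrative placeholders rather than the authors' configuration, and the paper's hybrid word segmentation and GRU–CRF labeling are assumed to have already converted the source sentence into morpheme ids upstream of this model.

```python
import torch
import torch.nn as nn

class MorphemeEncoder(nn.Module):
    """Encode a Mongolian sentence given as a sequence of morpheme ids with an LSTM.
    Dimensions are illustrative, not the paper's configuration."""
    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

    def forward(self, morpheme_ids):
        # morpheme_ids: (batch, src_len) integer tensor of morpheme indices
        embedded = self.embed(morpheme_ids)
        outputs, state = self.lstm(embedded)
        return outputs, state  # state = (h, c) summarizes the source sentence

class ChineseDecoder(nn.Module):
    """LSTM decoder that predicts Chinese tokens from the encoder's final state."""
    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tgt_ids, state):
        # tgt_ids: (batch, tgt_len) previously generated / gold Chinese tokens
        embedded = self.embed(tgt_ids)
        outputs, state = self.lstm(embedded, state)
        return self.out(outputs), state  # scores over the Chinese vocabulary

# Toy usage with random ids standing in for real morpheme/token indices.
src = torch.randint(1, 1000, (2, 7))   # 2 source sentences, 7 morphemes each
tgt = torch.randint(1, 2000, (2, 5))   # 2 target prefixes, 5 tokens each
encoder = MorphemeEncoder(vocab_size=1000)
decoder = ChineseDecoder(vocab_size=2000)
_, state = encoder(src)
logits, _ = decoder(tgt, state)        # (2, 5, 2000)
```

In this sketch the final hidden state of the morpheme-level encoder initializes the decoder, the standard way an LSTM encoder–decoder passes source-side semantics to the target side.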




Acknowledgements

This work was financially supported by the Natural Science Foundation of Inner Mongolia (2018MS06021, 2016MS0605), the Foundation of the Autonomous Regional Civil Committee of Inner Mongolia (MW-2017-MGYWXXH-03), the Inner Mongolia scientific and technological innovation guide reward funds project (facilities agricultural IoT key equipment and system development and industrialization demonstration), and the Inner Mongolia Science and Technology Plan Project (201502015).

Author information

Corresponding author

Correspondence to Yi La Su.


About this article

Cite this article

Qing-dao-er-ji, R., Su, Y.L. & Liu, W.W. Research on the LSTM Mongolian and Chinese machine translation based on morpheme encoding. Neural Comput & Applic 32, 41–49 (2020). https://doi.org/10.1007/s00521-018-3741-5
