EVA2.0: Investigating Open-domain Chinese Dialogue Systems with Large-scale Pre-training

  • Research Article
  • Machine Intelligence Research

Abstract

Large-scale pre-training has shown remarkable performance in building open-domain dialogue systems. However, previous work mainly focuses on showing and evaluating the conversational performance of the released dialogue models, while key factors behind a powerful human-like chatbot, especially in Chinese scenarios, remain under-discussed. In this paper, we conduct extensive experiments to investigate these under-explored factors, including data quality control, model architecture designs, training approaches, and decoding strategies. We propose EVA2.0, a large-scale pre-trained open-domain Chinese dialogue model with 2.8 billion parameters, and will make our models and code publicly available. Automatic and human evaluations show that EVA2.0 significantly outperforms other open-source counterparts. We also discuss the limitations of this work by presenting some failure cases, and pose future research directions for large-scale Chinese open-domain dialogue systems.
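Among the factors listed above, decoding strategies are the easiest to illustrate outside the paper. The sketch below is a minimal, purely illustrative example of nucleus (top-p) sampling, a decoding strategy commonly used for open-domain dialogue generation; it is not the authors' released implementation, and the vocabulary size and threshold are hypothetical.

```python
import torch

def top_p_filter(logits: torch.Tensor, p: float = 0.9) -> torch.Tensor:
    """Keep the smallest set of top-ranked tokens whose cumulative probability
    exceeds p (the "nucleus") and mask everything else with -inf."""
    sorted_logits, sorted_idx = torch.sort(logits, descending=True)
    probs = torch.softmax(sorted_logits, dim=-1)
    cum_probs = probs.cumsum(dim=-1)
    # Drop a token if the tokens ranked above it already cover probability mass p;
    # the top-ranked token is therefore always kept.
    drop = (cum_probs - probs) > p
    sorted_logits = sorted_logits.masked_fill(drop, float("-inf"))
    # Scatter the filtered logits back to their original vocabulary positions.
    filtered = torch.full_like(logits, float("-inf"))
    filtered.scatter_(dim=-1, index=sorted_idx, src=sorted_logits)
    return filtered

if __name__ == "__main__":
    # Hypothetical next-token logits over an illustrative 21128-token vocabulary.
    logits = torch.randn(21128)
    next_probs = torch.softmax(top_p_filter(logits, p=0.9), dim=-1)
    next_token_id = int(torch.multinomial(next_probs, num_samples=1))
    print(next_token_id)
```

In a real dialogue decoder, such a filter would be applied to the model's next-token logits at every generation step before sampling; greedy decoding and beam search are the usual deterministic alternatives compared in this line of work.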



Acknowledgements

This work was supported by the 2030 National Key AI Program of China (No. 2021ZD0113304), the National Science Foundation for Distinguished Young Scholars (No. 62125604), the NSFC projects (key project No. 61936010 and regular project No. 61876096), the Guoqiang Institute of Tsinghua University, China (Nos. 2019GQG1 and 2020GQG0005), and the Tsinghua-Toyota Joint Research Fund.

Author information

Corresponding author

Correspondence to Minlie Huang.

Additional information

Yuxian Gu received the B. Eng. degree in computer science and technology from Tsinghua University, China in 2021. Currently, he is a Ph. D. degree candidate in computer science and technology at Department of Computer Science and Technology, Tsinghua University, China.

His research interests include natural language processing, pre-trained language models, and dialogue systems.

Jiaxin Wen received the B. Eng. degree in computer science and technology from Tsinghua University, China in 2022. He is currently a master student in computer science and technology at Department of Computer Science and Technology, Tsinghua University, China.

His research interests include pre-trained language models and dialogue systems.

Hao Sun received the B. Eng. degree in computer science and technology from Shanghai Jiao Tong University, China in 2016. He is a master student in computer science and technology at Department of Computer Science and Technology, Tsinghua University, China.

His research interests include natural language generation and dialogue systems.

Yi Song received the B. Eng. degree in computer science and technology from Beijing Institute of Technology, China in 2021. He is currently a master student in computer science and technology at Department of Computer Science and Technology, Tsinghua University, China.

His research interests include natural language generation and dialogue systems.

Pei Ke received the Ph. D. degree in computer science and technology from Tsinghua University, China in 2022. He is currently a postdoctoral researcher at Department of Computer Science and Technology, Tsinghua University, China.

His research interests include natural language generation, dialogue systems and sentiment analysis.

Chujie Zheng received the B. Sc. degree in physics from Tsinghua University, China in 2020. He is a Ph. D. degree candidate in computer science and technology at Department of Computer Science and Technology, Tsinghua University, China.

His research interests include natural language generation and dialogue systems.

Zheng Zhang received the B. Eng. and Ph. D. degrees in computer science and technology from Department of Computer Science and Technology, Tsinghua University, China in 2015 and 2021, respectively. He is now a postdoctoral researcher at Tsinghua University, China.

His research interests include natural language processing, dialogue systems and text generation.

Jianzhu Yao is an undergraduate student in computer science and technology at Department of Computer Science and Technology, Tsinghua University, China.

His research interests include natural language generation and dialogue systems.

Lei Liu received the M. Sc. degree in computer science from Central China Normal University, China in 2019. He is currently a Ph. D. candidate in the Graduate Program of Electrical Engineering and Computer Science, York University, Canada.

His research interests include dialogue systems and natural language generation.

Xiaoyan Zhu received the B. Sc. degree in computer science and technology from University of Science and Technology Beijing, China in 1982, the M. Sc. degree in computer science and technology from Kobe University, Japan in 1987, and the Ph. D. degree in computer science and technology from Nagoya Institute of Technology, Japan in 1990. She is currently a professor with Department of Computer Science and Technology, Tsinghua University, China. She has authored more than 100 peer-reviewed articles in leading international conferences (SIGKDD, IJCAI, AAAI, ACL) and journals (TOIS, Bioinformatics, Genome Biology).

Her research interests include intelligent information processing, machine learning, natural language processing, question answering systems, and bioinformatics.

Minlie Huang received the Ph. D. degree in engineering physics from Tsinghua University, China in 2006. He is currently an associate professor with Department of Computer Science and Technology, Tsinghua University, China. He has published more than 60 papers in premier conferences and journals (ACL, EMNLP, AAAI, IJCAI, WWW, SIGIR, etc.). His work on emotional chatting machines was reported by MIT Technology Review, The Guardian, and many other mass media. He serves as a standing reviewer for TACL, an area chair for ACL 2020/2016 and EMNLP 2019/2014/2011, a senior PC member for AAAI 2017–2020 and IJCAI 2017–2020, and a reviewer for TASLP, TKDE, TOIS, TPAMI, etc. He is a nominee of the ACL 2019 Best Demo Papers, and the recipient of the IJCAI 2018 Distinguished Paper Award, the CCL 2018 Best Demo Award, the NLPCC 2015 Best Paper Award, the Hanvon Youth Innovation Award in 2018, and the Wuwenjun AI Award in 2019. He has been supported by an NSFC key project, several NSFC regular projects, and many IT companies.

His research interests include natural language processing, particularly in dialog systems, reading comprehension, and sentiment analysis.

About this article

Cite this article

Gu, Y., Wen, J., Sun, H. et al. EVA2.0: Investigating Open-domain Chinese Dialogue Systems with Large-scale Pre-training. Mach. Intell. Res. 20, 207–219 (2023). https://doi.org/10.1007/s11633-022-1387-3
