EVA2.0: Investigating Open-domain Chinese Dialogue Systems with Large-scale Pre-training

  • Research Article
  • Machine Intelligence Research

Abstract

Large-scale pre-training has shown remarkable performance in building open-domain dialogue systems. However, previous work mainly focuses on showing and evaluating the conversational performance of the released dialogue models, while key factors behind a powerful human-like chatbot, especially in Chinese scenarios, remain under-discussed. In this paper, we conduct extensive experiments to investigate these under-explored factors, including data quality control, model architecture designs, training approaches, and decoding strategies. We propose EVA2.0, a large-scale pre-trained open-domain Chinese dialogue model with 2.8 billion parameters, and will make our models and code publicly available. Automatic and human evaluations show that EVA2.0 significantly outperforms other open-source counterparts. We also discuss the limitations of this work by presenting some failure cases, and pose future research directions for large-scale Chinese open-domain dialogue systems.
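Among the factors listed above, decoding strategies are the easiest to illustrate outside the paper. The sketch below is a minimal, purely illustrative example of nucleus (top-p) sampling, a decoding strategy commonly used for open-domain dialogue generation; it is not the authors' released implementation, and the vocabulary size and threshold are hypothetical.

```python
import torch

def top_p_filter(logits: torch.Tensor, p: float = 0.9) -> torch.Tensor:
    """Keep the smallest set of top-ranked tokens whose cumulative probability
    exceeds p (the "nucleus") and mask everything else with -inf."""
    sorted_logits, sorted_idx = torch.sort(logits, descending=True)
    probs = torch.softmax(sorted_logits, dim=-1)
    cum_probs = probs.cumsum(dim=-1)
    # Drop a token if the tokens ranked above it already cover probability mass p;
    # the top-ranked token is therefore always kept.
    drop = (cum_probs - probs) > p
    sorted_logits = sorted_logits.masked_fill(drop, float("-inf"))
    # Scatter the filtered logits back to their original vocabulary positions.
    filtered = torch.full_like(logits, float("-inf"))
    filtered.scatter_(dim=-1, index=sorted_idx, src=sorted_logits)
    return filtered

if __name__ == "__main__":
    # Hypothetical next-token logits over an illustrative 21128-token vocabulary.
    logits = torch.randn(21128)
    next_probs = torch.softmax(top_p_filter(logits, p=0.9), dim=-1)
    next_token_id = int(torch.multinomial(next_probs, num_samples=1))
    print(next_token_id)
```

In a real dialogue decoder, such a filter would be applied to the model's next-token logits at every generation step before sampling; greedy decoding and beam search are the usual deterministic alternatives compared in this line of work.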



Acknowledgements

This work was supported by the 2030 National Key AI Program of China (No. 2021ZD0113304), the National Science Foundation for Distinguished Young Scholars (No. 62125604), the NSFC projects (key project No. 61936010 and regular project No. 61876096), the Guoqiang Institute of Tsinghua University, China (Nos. 2019GQG1 and 2020GQG0005), and the Tsinghua-Toyota Joint Research Fund.

Author information

Corresponding author

Correspondence to Minlie Huang.

Additional information

Yuxian Gu received the B. Eng. degree in computer science and technology from Tsinghua University, China in 2021. Currently, he is a Ph. D. degree candidate in computer science and technology at Department of Computer Science and Technology, Tsinghua University, China.

His research interests include natural language processing, pre-trained language models, and dialogue systems.

Jiaxin Wen received the B. Eng. degree in computer science and technology from Tsinghua University, China in 2022. He is currently a master student in computer science and technology at Department of Computer Science and Technology, Tsinghua University, China.

His research interests include pre-trained language models and dialogue systems.

Hao Sun received the B. Eng. degree in computer science and technology from Shanghai Jiao Tong University, China in 2016. He is a master student in computer science and technology at Department of Computer Science and Technology, Tsinghua University, China.

His research interests include natural language generation and dialogue systems.

Yi Song received the B. Eng. degree in computer science and technology from Beijing Institute of Technology, China in 2021. He is currently a master student in computer science and technology at Department of Computer Science and Technology, Tsinghua University, China.

His research interests include natural language generation and dialogue systems.

Pei Ke received the Ph. D. degree in computer science and technology from Tsinghua University, China in 2022. He is currently a postdoctoral researcher at Department of Computer Science and Technology, Tsinghua University, China.

His research interests include natural language generation, dialogue systems and sentiment analysis.

Chujie Zheng received the B. Sc. degree in physics from Tsinghua University, China in 2020. He is a Ph. D. degree candidate in computer science and technology at Department of Computer Science and Technology, Tsinghua University, China.

His research interests include natural language generation and dialogue systems.

Zheng Zhang received the B. Eng. and Ph. D. degrees in computer science and technology from Department of Computer Science and Technology, Tsinghua University, China in 2015 and 2021, respectively. He is now a postdoctoral researcher at Tsinghua University, China.

His research interests include natural language processing, dialogue systems and text generation.

Jianzhu Yao is an undergraduate student in computer science and technology at Department of Computer Science and Technology, Tsinghua University, China.

His research interests include natural language generation and dialogue systems.

Lei Liu received the M. Sc. degree in computer science from Central China Normal University, China in 2019. He is currently a Ph. D. candidate in the Graduate Program of Electrical Engineering and Computer Science, York University, Canada.

His research interests include dialogue systems and natural language generation.

Xiaoyan Zhu received the B. Sc. degree in computer science and technology from University of Science and Technology Beijing, China in 1982, the M. Sc. degree in computer science and technology from Kobe University, Japan in 1987, and the Ph. D. degree in computer science and technology from Nagoya Institute of Technology, Japan in 1990. She is currently a professor with Department of Computer Science and Technology, Tsinghua University, China. She has authored more than 100 peer-reviewed articles in leading international conferences (SIGKDD, IJCAI, AAAI, ACL) and journals (TOIS, Bioinformatics, Genome Biology).

Her research interests include intelligent information processing, machine learning, natural language processing, question answering systems, and bioinformatics.

Minlie Huang received the Ph. D. degree in engineering physics from Tsinghua University, China in 2006. He is currently an associate professor with Department of Computer Science and Technology, Tsinghua University, China. He has published more than 60 papers in premier conferences and journals (ACL, EMNLP, AAAI, IJCAI, WWW, SIGIR, etc.). His work on emotional chatting machines was reported by MIT Technology Review, The Guardian, and many other mass media. He serves as a standing reviewer for TACL, an area chair for ACL 2020/2016 and EMNLP 2019/2014/2011, a senior PC member for AAAI 2017–2020 and IJCAI 2017–2020, and a reviewer for TASLP, TKDE, TOIS, TPAMI, etc. He is a nominee of the ACL 2019 Best Demo Papers, and the recipient of the IJCAI 2018 Distinguished Paper Award, the CCL 2018 Best Demo Award, the NLPCC 2015 Best Paper Award, the Hanvon Youth Innovation Award in 2018, and the Wuwenjun AI Award in 2019. He has been supported by an NSFC key project, several NSFC regular projects, and many IT companies.

His research interests include natural language processing, particularly in dialog systems, reading comprehension, and sentiment analysis.

About this article

Cite this article

Gu, Y., Wen, J., Sun, H. et al. EVA2.0: Investigating Open-domain Chinese Dialogue Systems with Large-scale Pre-training. Mach. Intell. Res. 20, 207–219 (2023). https://doi.org/10.1007/s11633-022-1387-3
