HaGAN: Hierarchical Attentive Adversarial Learning for Task-Oriented Dialogue System

  • Conference paper
  • First Online:
Neural Information Processing (ICONIP 2019)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11953)


Abstract

Task-oriented dialogue systems are commonly formulated as reinforcement learning problems, in which a reward serving as the learning objective is offered at the end of the generated dialogue to help optimize the system. Since fulfilling a specific task often takes many turns between the system and the user, a scalar reward signal delivered only after this long process is delayed and sparse. To address these problems in reinforcement learning (RL) based task-completion systems, we propose a novel hierarchical attentive adversarial network, HaGAN, which features a cascaded attentive generator (CAG) that explores the state-action space to generate a dialogue, and global-local attentive discriminators (GLAD) that give relevant rewards at multiple scales of the dialogue state. Specifically, after every turn of dialogue generation, the turn-based discriminator evaluates the current turn and gives a local reward reflecting the generator's current generating ability. When the dialogue finishes, the dialogue-based discriminator gives a global reward concerning the whole dialogue. Finally, a synthesized reward, computed by combining the global and local rewards, is returned to the generator. In this way, the generator learns to produce dialogues that are fluent and informative both globally and locally. Experiments on two public benchmark datasets demonstrate the superiority of HaGAN over other representative state-of-the-art methods.
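The reward scheme described above can be sketched as follows. This is a minimal illustration, not the paper's exact formula: the weighted-sum combination rule, the interpolation weight `alpha`, and the function name `synthesize_rewards` are assumptions introduced for clarity.

```python
def synthesize_rewards(local_rewards, global_reward, alpha=0.5):
    """Combine per-turn local rewards with one dialogue-level global reward.

    local_rewards: turn-level discriminator scores in [0, 1], one per turn.
    global_reward: dialogue-level discriminator score in [0, 1], available
                   only after the dialogue finishes.
    alpha: assumed interpolation weight between global and local signals.

    Returns one synthesized reward per turn, so the generator receives a
    dense signal instead of a single delayed scalar at the dialogue's end.
    """
    return [alpha * global_reward + (1 - alpha) * r for r in local_rewards]


# A 3-turn dialogue: local rewards arrive turn by turn, the global reward
# once at the end; every turn then carries a combined training signal.
local = [0.8, 0.4, 0.6]
rewards = synthesize_rewards(local, global_reward=0.9, alpha=0.5)
print([round(r, 2) for r in rewards])  # [0.85, 0.65, 0.75]
```

Broadcasting the terminal global reward back onto every turn is one common way to densify a sparse episode-level signal; HaGAN's contribution is obtaining both signal scales from learned attentive discriminators rather than a hand-crafted reward.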



Acknowledgements

This work is supported in part by the Chinese National Double First-Class Project on the digital protection of cultural relics in grotto temples, and by the equipment-upgrading program for scientific research institutes of the Chinese National Cultural Heritage Administration.

Author information


Corresponding author

Correspondence to Duanqing Xu.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Fang, T., Qiao, T., Xu, D. (2019). HaGAN: Hierarchical Attentive Adversarial Learning for Task-Oriented Dialogue System. In: Gedeon, T., Wong, K., Lee, M. (eds) Neural Information Processing. ICONIP 2019. Lecture Notes in Computer Science(), vol 11953. Springer, Cham. https://doi.org/10.1007/978-3-030-36708-4_9


  • DOI: https://doi.org/10.1007/978-3-030-36708-4_9

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-36707-7

  • Online ISBN: 978-3-030-36708-4

  • eBook Packages: Computer Science (R0)
