Conversation context; Conversation history; Conversation models; Deep neural networks; Context models; Context-Aware; Contextual modeling; Conversation model; Conversation systems; Encodings; In contexts; Information Systems; Computer Networks and Communications
Abstract :
Conversation modeling is an important and challenging task in natural language processing because it is a key component of automated human-machine conversation. Most recent research on conversation modeling uses only the current utterance (treated as the current question) to generate a response, and thus fails to capture the logic of the conversation from its beginning. Some studies concatenate the current question with the previous sentences of the conversation and use the result as input for response generation. Another approach uses a single encoder to store all previous utterances: each time a new question arrives, the encoder is updated and then used to generate the response. Our approach differs from these studies in that we explicitly separate the encoding of the question from the encoding of its context. This yields different encoding models for the question and the context, each capturing the specificity of its input, and gives access to the entire context when generating the response. To this end, we propose a deep neural network-based model, called the Context Model, that encodes the information in previous utterances and combines it with the current question. This approach supplies the required context information while keeping the roles of the current question and its context distinct during response generation. We investigate two approaches for representing the context: a long short-term memory (LSTM) network and a convolutional neural network (CNN). Experiments show that our Context Model outperforms a baseline model on both the ConvAI2 dataset and a collected dataset of conversational English.
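The separation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the plain tanh recurrent cell stands in for the LSTM variant, the toy `embed` function stands in for learned embeddings, and combining the two encodings by concatenation is an assumption, since the abstract does not say how they are combined. All names (`RecurrentEncoder`, `build_decoder_input`, `DIM`) are hypothetical.

```python
import math
import random

DIM = 8  # hypothetical embedding and hidden size


def make_matrix(rows, cols, rng):
    """Random weight matrix; a trained model would learn these values."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def embed(token, dim=DIM):
    """Toy deterministic embedding seeded by the token text (assumption)."""
    rng = random.Random(token)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


class RecurrentEncoder:
    """Plain tanh recurrent encoder, a simplified stand-in for an LSTM."""

    def __init__(self, dim, seed):
        rng = random.Random(seed)
        self.dim = dim
        self.w_in = make_matrix(dim, dim, rng)
        self.w_hidden = make_matrix(dim, dim, rng)

    def encode(self, tokens):
        hidden = [0.0] * self.dim
        for token in tokens:
            from_input = matvec(self.w_in, embed(token, self.dim))
            from_state = matvec(self.w_hidden, hidden)
            hidden = [math.tanh(a + b) for a, b in zip(from_input, from_state)]
        return hidden


def build_decoder_input(question, history, question_encoder, context_encoder):
    """Encode question and context separately, then concatenate (assumption)."""
    question_vec = question_encoder.encode(question.lower().split())
    # The context encoder has its own parameters and reads every previous
    # utterance in order, so the whole history is available to the decoder.
    context_tokens = [tok for utt in history for tok in utt.lower().split()]
    context_vec = context_encoder.encode(context_tokens)
    return question_vec + context_vec


# Two encoders with different parameters: one for the question, one for context.
question_encoder = RecurrentEncoder(DIM, seed=1)
context_encoder = RecurrentEncoder(DIM, seed=2)

history = ["hi there", "hello how are you"]
decoder_input = build_decoder_input(
    "what do you do for work", history, question_encoder, context_encoder
)
print(len(decoder_input))
```

Because the two encoders do not share parameters, changing the history changes only the context half of the vector and leaves the question half untouched, which is the separation of roles the abstract argues for. A CNN-based context encoder could replace `context_encoder` as long as it exposes the same `encode` method.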
Disciplines :
Computer science
Author, co-author :
Luong Tran, Quoc-Dai; Natural Language Processing and Knowledge Discovery Laboratory, Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City, Viet Nam
Vu, Dinh-Hong; Natural Language Processing and Knowledge Discovery Laboratory, Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City, Viet Nam
Le, Anh-Cuong; Natural Language Processing and Knowledge Discovery Laboratory, Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City, Viet Nam