Test Case Generation for Dialogflow Task-Based Chatbots

📅 2025-03-07

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper addresses the limited support for automated test case generation for task-oriented Dialogflow chatbots. It proposes CTG, an end-to-end, semantics-aware test generation method designed around Dialogflow’s task flow semantics. CTG is presented as the first approach to combine dialogue state machine modeling, intent graph analysis, and controllable path traversal with intent-entity combinatorial generation and randomized mutation testing. Dialogflow API parsing and execution feedback provide closed-loop validation of the generated tests. In an evaluation on seven Dialogflow chatbots, the test cases generated by CTG were more robust and more effective at detecting defects than those of the state-of-the-art tools BOTIUM and CHARM. By grounding test generation in semantic structures rather than surface-level utterances, CTG offers a scalable, semantics-driven approach to automating the testing of task-oriented dialogue systems.
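The ideas named in this summary (path traversal over an intent graph, intent-entity combination, and randomized mutation) can be illustrated with a small sketch. This is not the CTG implementation and does not call the Dialogflow API. The graph, training phrases, entity values, and every function name below are hypothetical, chosen only to show how the three steps might fit together.

```python
import itertools
import random

# Hypothetical intent graph: each intent maps to the intents that may follow it.
INTENT_GRAPH = {
    "welcome": ["book_room", "ask_hours"],
    "book_room": ["confirm"],
    "ask_hours": [],
    "confirm": [],
}

# Hypothetical training phrases with {entity} slots, and hypothetical entity values.
PHRASES = {
    "welcome": ["hello"],
    "book_room": ["book a {room} room for {nights} nights"],
    "ask_hours": ["when are you open"],
    "confirm": ["yes"],
}
ENTITIES = {"room": ["single", "double"], "nights": ["1", "2"]}


def enumerate_paths(graph, start, max_depth=4):
    """Depth-first enumeration of intent paths, ending at a leaf or at max_depth."""
    paths = []
    stack = [[start]]
    while stack:
        path = stack.pop()
        # Skip intents already on the path so cycles cannot loop forever.
        successors = [n for n in graph[path[-1]] if n not in path]
        if not successors or len(path) == max_depth:
            paths.append(path)
            continue
        for nxt in successors:
            stack.append(path + [nxt])
    return paths


def fill_slots(phrase, entities):
    """Yield one utterance per combination of values for the slots in phrase."""
    slots = [name for name in entities if "{" + name + "}" in phrase]
    for combo in itertools.product(*(entities[s] for s in slots)):
        yield phrase.format(**dict(zip(slots, combo)))


def mutate(utterance, rng):
    """Swap two adjacent characters to mimic a typing error."""
    if len(utterance) < 2:
        return utterance
    i = rng.randrange(len(utterance) - 1)
    chars = list(utterance)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def generate_tests(graph, phrases, entities, start, seed=0):
    """Build conversations as lists of (utterance, expected_intent) pairs.

    Each clean conversation is followed by one mutated variant.
    """
    rng = random.Random(seed)
    tests = []
    for path in enumerate_paths(graph, start):
        # All concrete utterances available at each step of the path.
        options = [
            [u for p in phrases[intent] for u in fill_slots(p, entities)]
            for intent in path
        ]
        for utterances in itertools.product(*options):
            steps = list(zip(utterances, path))
            tests.append(steps)
            tests.append([(mutate(u, rng), intent) for u, intent in steps])
    return tests
```

Calling `generate_tests(INTENT_GRAPH, PHRASES, ENTITIES, "welcome")` returns conversations in which every turn pairs an utterance with the intent the bot is expected to match. In a real setup, each utterance would be sent to the bot and the detected intent compared with the expected one; that execution step is left out here.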

📝 Abstract

Chatbots are software systems, typically embedded in web and mobile applications, designed to assist users in a wide range of activities, from chit-chat to task completion. They support diverse forms of interaction, such as text and voice commands. Like any software, chatbots are susceptible to bugs, and their pervasiveness in our lives, together with the underlying technological advancements, calls for tailored quality assurance techniques. However, test case generation techniques for conversational chatbots are still limited. In this paper, we present Chatbot Test Generator (CTG), an automated testing technique designed for task-based chatbots. We conducted an experiment comparing CTG with the state-of-the-art tools BOTIUM and CHARM on seven chatbots, and observed that the test cases generated by CTG outperformed those of the competitors in terms of robustness and effectiveness.

Problem

Research questions and friction points this paper is trying to address.

Automated test case generation for task-based chatbots.

Addressing limited testing techniques for conversational chatbots.

Improving robustness and effectiveness of chatbot testing.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Automated test case generation

Task-based chatbot testing

Outperforms BOTIUM and CHARM

🔎 Similar Papers

Enhancing Large Language Models for Text-to-Testcase Generation