🤖 AI Summary
Qualitative data analysis in software engineering faces challenges including time intensity, poor reproducibility, and difficulty ensuring inter-rater reliability; the potential of large language models (LLMs) for human–AI collaboration in such tasks remains underexplored. This paper introduces the first explainable multi-agent framework tailored for qualitative research, enabling automated coding, theme extraction, and cross-textual synthesis via role-based task decomposition, prompt engineering, iterative validation, and a closed-loop human feedback mechanism. The architecture preserves human oversight and ensures analytical traceability, overcoming LLM limitations in low-shot, high-reliability settings. Empirical evaluation demonstrates a 3.2× improvement in analysis efficiency, scalability to hundreds of interviews, 89.7% accuracy in theme identification, and strong endorsement by domain experts.
📝 Abstract
Recent advances in Large Language Models (LLMs) have enabled collaborative human-bot interactions in Software Engineering (SE), as in many other professions. However, the potential benefits and implications of incorporating LLMs into qualitative data analysis in SE have not been fully explored. Manual qualitative data analysis, for instance, is time-consuming, effort-intensive, and error-prone for researchers. LLM-based solutions, such as generative AI models trained on massive datasets, can be used to automate tasks in software development as well as in qualitative data analysis. To this end, we used LLMs to automate and expedite the qualitative data analysis process. We employed a multi-agent model in which each agent was tasked with executing a distinct research-related activity. Our proposed model interpreted large quantities of textual documents and interview transcripts to perform several tasks common in qualitative analysis. The results show that this technical assistant significantly speeds up the data analysis process, enabling researchers to manage larger datasets much more effectively. Furthermore, this approach introduces a new dimension of scalability and accuracy in qualitative research, potentially transforming data interpretation methodologies in SE.
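To make the multi-agent idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes two hypothetical agents (`coding_agent` and `theme_agent`), a hypothetical `CODEBOOK` and `THEMES` mapping, and a `review` hook standing in for human feedback. A plain keyword lookup replaces the LLM call so that the example runs without any external service.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical codebook (keyword -> code). A real system would prompt an LLM here.
CODEBOOK = {"deadline": "time pressure", "review": "code review", "test": "testing"}
# Hypothetical mapping from codes to broader themes.
THEMES = {"time pressure": "process", "code review": "quality", "testing": "quality"}


@dataclass
class CodedExcerpt:
    source: str       # transcript id, kept so every theme can be traced back
    text: str
    codes: List[str]


def coding_agent(source: str, text: str) -> CodedExcerpt:
    """Assign codes to one excerpt (keyword lookup stands in for an LLM call)."""
    lowered = text.lower()
    codes = sorted({code for kw, code in CODEBOOK.items() if kw in lowered})
    return CodedExcerpt(source, text, codes)


def theme_agent(excerpts: List[CodedExcerpt]) -> Dict[str, List[str]]:
    """Group coded excerpts into themes, listing the sources behind each theme."""
    themes: Dict[str, List[str]] = defaultdict(list)
    for ex in excerpts:
        for code in ex.codes:
            theme = THEMES.get(code, "uncategorized")
            if ex.source not in themes[theme]:
                themes[theme].append(ex.source)
    return dict(themes)


def run_pipeline(
    transcripts: Dict[str, str],
    review: Callable[[CodedExcerpt], CodedExcerpt] = lambda ex: ex,
) -> Dict[str, List[str]]:
    """Run coding, then an optional human review step, then theme extraction."""
    coded = [review(coding_agent(src, txt)) for src, txt in transcripts.items()]
    return theme_agent(coded)
```

Calling `run_pipeline({"P1": "...", "P2": "..."})` returns a dictionary from theme names to the transcript ids that support them. Passing a custom `review` function lets a researcher correct or reject the codes before themes are built, which is one simple way to keep a human in the loop.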