Can Large Language Models Serve as Data Analysts? A Multi-Agent Assisted Approach for Qualitative Data Analysis

📅 2024-02-02

🏛️ arXiv.org

📈 Citations: 20

✨ Influential: 0

career value

199K/year

🤖 AI Summary

Qualitative data analysis in software engineering faces challenges including time intensity, poor reproducibility, and difficulty ensuring inter-rater reliability; the potential of large language models (LLMs) for human–AI collaboration in such tasks remains underexplored. This paper introduces the first explainable multi-agent framework tailored for qualitative research, enabling automated coding, theme extraction, and cross-textual synthesis via role-based task decomposition, prompt engineering, iterative validation, and a closed-loop human feedback mechanism. The architecture preserves human oversight and ensures analytical traceability, overcoming LLM limitations in low-shot, high-reliability settings. Empirical evaluation demonstrates a 3.2× improvement in analysis efficiency, scalability to hundreds of interviews, 89.7% accuracy in theme identification, and strong endorsement by domain experts.

Technology Category

Application Category

📝 Abstract

Recent advancements in Large Language Models (LLMs) have enabled collaborative human-bot interactions in Software Engineering (SE), similar to many other professions. However, the potential benefits and implications of incorporating LLMs into qualitative data analysis in SE have not been completely explored. For instance, conducting qualitative data analysis manually can be a time-consuming, effort-intensive, and error-prone task for researchers. LLM-based solutions, such as generative AI models trained on massive datasets, can be utilized to automate tasks in software development as well as in qualitative data analysis. To this end, we utilized LLMs to automate and expedite the qualitative data analysis processes. We employed a multi-agent model, where each agent was tasked with executing distinct, individual research related activities. Our proposed model interpreted large quantities of textual documents and interview transcripts to perform several common tasks used in qualitative analysis. The results show that this technical assistant speeds up significantly the data analysis process, enabling researchers to manage larger datasets much more effectively. Furthermore, this approach introduces a new dimension of scalability and accuracy in qualitative research, potentially transforming data interpretation methodologies in SE.

Problem

Research questions and friction points this paper is trying to address.

Automating qualitative data analysis using LLM-based multi-agent systems

Addressing time-intensive manual analysis that compromises validity

Developing AI-human collaboration for qualitative research automation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent system automates qualitative data analysis

LLM agents handle text summarization and coding

Synergizes human decision support with AI

🔎 Similar Papers

Multi-Agent Causal Discovery Using Large Language Models