Agent-as-Peer-Debriefer: A Multi-Agent Framework with Perspective-Based Refinement for Qualitative Analysis

📅 2026-05-23
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This study addresses the limited depth of large language models (LLMs) in qualitative data analysis, which the authors attribute to the absence of credibility safeguards such as peer debriefing. They propose a framework that formalizes peer debriefing as multi-agent collaboration: a hierarchical coding agent produces initial codes, sub-themes, and themes along with self-explanations and reflection memos, and three peer agents, each embodying a distinct analytical perspective (theory-driven, data-driven, or applied), refine the codes through perspective-informed operations (keeping, renaming, reassigning, merging, or splitting). Experiments on three datasets across two domains with three LLMs show that this approach yields codes semantically closer to human annotations than a single-LLM baseline. An ablation indicates that the gain comes from the perspective-based peer debriefing itself rather than from additional refinement alone, and each perspective exhibits distinct, interpretable trade-offs. The work treats analytical perspective as a controllable design dimension for more credible and interpretable LLM-assisted qualitative analysis.
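The five refinement operations named above can be pictured as edits to a small codebook. The following is a minimal sketch under stated assumptions, not the authors' implementation: the `Code` dataclass, the `apply_operation` function, and the dictionary layout of an operation are all hypothetical names chosen for illustration, and the summary does not say how the agents actually represent codes or operations.

```python
from dataclasses import dataclass, field


@dataclass
class Code:
    """Hypothetical record for one code: its label, parent theme, and excerpts."""
    name: str
    theme: str
    excerpts: list[str] = field(default_factory=list)


def apply_operation(codebook: dict[str, Code], op: dict) -> dict[str, Code]:
    """Return a new codebook with one refinement operation applied.

    The operation format (a dict with an "op" key) is an assumption.
    """
    # Copy every entry so the caller's codebook is left untouched.
    book = {n: Code(c.name, c.theme, list(c.excerpts)) for n, c in codebook.items()}
    kind = op["op"]
    if kind == "keep":
        pass  # the peer agent endorses the code as it stands
    elif kind == "rename":
        code = book.pop(op["code"])
        code.name = op["new_name"]
        book[code.name] = code
    elif kind == "reassign":
        book[op["code"]].theme = op["new_theme"]
    elif kind == "merge":
        # The merged code inherits the theme of the first code listed.
        merged = Code(op["new_name"], book[op["codes"][0]].theme)
        for name in op["codes"]:
            merged.excerpts.extend(book.pop(name).excerpts)
        book[merged.name] = merged
    elif kind == "split":
        parent = book.pop(op["code"])
        for new_name, excerpts in op["parts"].items():
            book[new_name] = Code(new_name, parent.theme, list(excerpts))
    else:
        raise ValueError(f"unknown operation: {kind}")
    return book


# Illustrative data only; these codes do not come from the paper.
codebook = {
    "cost worries": Code("cost worries", "Barriers", ["e1"]),
    "price concerns": Code("price concerns", "Barriers", ["e2"]),
    "app confusion": Code("app confusion", "Barriers", ["e3", "e4"]),
}
refined = apply_operation(
    codebook,
    {"op": "merge", "codes": ["cost worries", "price concerns"], "new_name": "affordability"},
)
```

Returning a fresh codebook rather than mutating the input makes it easy to compare the refinements proposed by different peer perspectives against the same starting codes.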
📝 Abstract
Large language models (LLMs) are increasingly used for qualitative data analysis (QDA), yet their outputs often miss the depth and nuance of human analysis. We argue this gap reflects a missing credibility practice from human QDA: peer debriefing, in which an analyst seeks feedback from a disinterested peer and uses it to refine their coding. To bring this practice into LLM-assisted QDA, we propose Agent-as-Peer-Debriefer, a multi-agent QDA framework that builds peer debriefing into key coding steps. In our framework, a Hierarchical Coding Agent follows the standard QDA process to generate codes, sub-themes, and themes, along with self-explanations and reflection memos. It then shares these outputs with three Peer-Debriefing Agents, each applying a distinct analytical perspective (Theory-Driven, Data-Driven, or Applied) and refining the codes by keeping, renaming, reassigning, merging, or splitting them. These perspectives are drawn from established human QDA practices that generalize across domains and datasets. To evaluate the framework, we test it on three datasets across two domains with three LLMs, measuring semantic similarity to human-annotated codes. Across all settings, perspective-based, peer-debriefing refinement aligns more closely with human codes than a single-LLM baseline, and an ablation further shows the gain is not merely from additional refinement. The three perspectives also produce distinct trade-offs, showing that the choice of perspective is a meaningful and controllable design decision. More broadly, these findings suggest that simulating peer debriefing with explicit perspectives is a promising route to more credible LLM-assisted QDA.
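The abstract reports semantic similarity to human-annotated codes but does not state the exact metric, so the following is only a rough sketch of one plausible way to compute such a score. The best-match averaging scheme, the function names `embed`, `cosine`, and `alignment_score`, and above all the bag-of-words `embed` placeholder are assumptions; a real evaluation would most likely substitute a sentence-embedding model.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Placeholder embedding (assumption): lowercase bag-of-words token counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def alignment_score(generated: list[str], human: list[str]) -> float:
    """Mean, over human codes, of the best cosine similarity to any generated code."""
    if not generated or not human:
        return 0.0
    gen_vecs = [embed(g) for g in generated]
    best = [max(cosine(embed(h), g) for g in gen_vecs) for h in human]
    return sum(best) / len(human)


# Illustrative labels only; these codes do not come from the paper.
human_codes = ["cost barriers", "trust in staff"]
baseline_codes = ["app confusion"]
refined_codes = ["cost barriers", "trust in clinic staff"]
```

Under this sketch, comparing `alignment_score(refined_codes, human_codes)` with `alignment_score(baseline_codes, human_codes)` mirrors the kind of comparison the abstract describes between refined codes and a single-LLM baseline.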
Problem

Research questions and friction points this paper is trying to address.

qualitative data analysis
large language models
peer debriefing
credibility
coding
Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-agent framework
peer debriefing
perspective-based refinement
qualitative data analysis
large language models