An Agentic AI System for Multi-Framework Communication Coding

📅 2025-12-09

🤖 AI Summary

High-quality coding of clinical doctor–patient dialogues is hampered by costly manual annotation, poor inter-annotator consistency, and limited cross-domain generalizability. To address this, we propose MOSAIC, a LangGraph-based multi-agent system with four agent roles (codebook selection and workflow planning, retrieval database updating, label generation, and consistency verification) that combines retrieval-augmented generation (RAG) with dynamic few-shot prompting. MOSAIC is designed for automated, interpretable, and adaptive dialogue coding across multiple clinical communication frameworks. Evaluated on a test set of 50 clinician–patient transcripts spanning rheumatology and OB/GYN, it achieves an overall F1 score of 0.928; on the rheumatology subset, F1 reaches 0.962, and performance is strongest for patient behavior identification. This work aims to improve the scalability, reliability, and cross-domain applicability of clinical communication analysis.

📝 Abstract

Clinical communication is central to patient outcomes, yet large-scale human annotation of patient-provider conversations remains labor-intensive, inconsistent, and difficult to scale. Existing approaches based on large language models typically rely on single-task models that lack adaptability, interpretability, and reliability, especially when applied across various communication frameworks and clinical domains. In this study, we developed a Multi-framework Structured Agentic AI system for Clinical Communication (MOSAIC), built on a LangGraph-based architecture that orchestrates four core agent roles: a Plan Agent for codebook selection and workflow planning, an Update Agent for maintaining up-to-date retrieval databases, a set of Annotation Agents that apply codebook-guided retrieval-augmented generation (RAG) with dynamic few-shot prompting, and a Verification Agent that provides consistency checks and feedback. To evaluate performance, we compared MOSAIC outputs against gold-standard annotations created by trained human coders. We developed and evaluated MOSAIC using 26 gold-standard annotated transcripts for training and 50 transcripts for testing, spanning rheumatology and OB/GYN domains. On the test set, MOSAIC achieved an overall F1 score of 0.928. Performance was highest in the Rheumatology subset (F1 = 0.962) and strongest for Patient Behavior (e.g., patients asking questions, expressing preferences, or showing assertiveness). In ablation and benchmarking experiments, MOSAIC outperformed the baselines.
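To make the reported metric concrete, the following is a minimal sketch of how an F1 score can be computed by comparing predicted codes against gold-standard codes, one label set per utterance. The abstract does not say whether the reported F1 is micro- or macro-averaged, so micro-averaging is an assumption here, and the label names and example data are purely illustrative rather than taken from the study.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1 over per-utterance label sets.

    gold, predicted: equal-length lists of sets of code labels.
    Micro-averaging is an assumption; the paper may aggregate differently.
    """
    tp = fp = fn = 0
    for g, p in zip(gold, predicted):
        tp += len(g & p)   # codes both the coder and the system assigned
        fp += len(p - g)   # codes only the system assigned
        fn += len(g - p)   # codes only the human coder assigned
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Illustrative data only (hypothetical labels, not from the study).
gold = [{"question"}, {"preference"}, {"assertive", "question"}, set()]
predicted = [{"question"}, {"preference", "question"}, {"assertive"}, set()]
score = micro_f1(gold, predicted)
```

With these toy inputs there are three true positives, one false positive, and one false negative, so precision and recall are both 0.75.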

Problem

Research questions and friction points this paper is trying to address.

Automates clinical communication coding across multiple frameworks

Addresses labor-intensive and inconsistent human annotation challenges

Enhances adaptability and reliability in diverse clinical domains

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent LangGraph architecture orchestrates four specialized agents

Codebook-guided RAG with dynamic few-shot prompting for annotation

Verification agent ensures consistency and provides feedback loops
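The control flow behind these three points can be sketched in plain Python. This is not the authors' implementation and does not use the LangGraph API: the codebook, the retrieval store, the token-overlap similarity, and every function name below are hypothetical stand-ins. In particular, where a real Annotation Agent would prompt an LLM with the codebook and the retrieved demonstrations, this sketch simply copies the label of the closest demonstration so that it runs offline.

```python
from dataclasses import dataclass, field

# Hypothetical codebook: framework name -> allowed labels (not from the paper).
CODEBOOKS = {"patient_behavior": {"asks_question", "states_preference", "none"}}
# Hypothetical retrieval store of gold-annotated (utterance, label) pairs.
STORE = {"patient_behavior": []}


@dataclass
class State:
    utterance: str
    framework: str = ""
    examples: list = field(default_factory=list)
    label: str = ""
    feedback: str = ""


def plan_agent(state, requested_framework):
    """Select the codebook that the rest of the workflow will use."""
    if requested_framework not in CODEBOOKS:
        raise ValueError(f"unknown framework: {requested_framework}")
    state.framework = requested_framework
    return state


def update_agent(framework, utterance, label):
    """Add a newly annotated example to the retrieval store."""
    STORE[framework].append((utterance, label))


def _overlap(a, b):
    """Jaccard similarity on lowercase tokens; a stand-in for embedding search."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def annotation_agent(state, k=2):
    """Pick the k most similar stored examples as few-shot demonstrations."""
    pool = STORE[state.framework]
    if state.feedback:  # on retry, drop demonstrations with out-of-codebook labels
        pool = [ex for ex in pool if ex[1] in CODEBOOKS[state.framework]]
    ranked = sorted(pool, key=lambda ex: _overlap(state.utterance, ex[0]),
                    reverse=True)
    state.examples = ranked[:k]
    # Stand-in for the LLM call: copy the label of the closest demonstration.
    state.label = state.examples[0][1] if state.examples else "none"
    return state


def verification_agent(state):
    """Check the label against the selected codebook and record feedback."""
    ok = state.label in CODEBOOKS[state.framework]
    state.feedback = "" if ok else f"label {state.label!r} is not in the codebook"
    return ok


def run(utterance, framework, max_rounds=2):
    """Plan, annotate, verify; re-annotate with feedback if verification fails."""
    state = plan_agent(State(utterance), framework)
    for _ in range(max_rounds):
        state = annotation_agent(state)
        if verification_agent(state):
            break
    else:
        state.label = "none"  # fall back if no round passed verification
    return state


update_agent("patient_behavior", "Can I ask what this medication does?", "asks_question")
update_agent("patient_behavior", "I would prefer to avoid injections", "states_preference")
result = run("What does this medication do?", "patient_behavior")
```

Because the demonstrations are chosen per utterance at annotation time, the few-shot prompt changes with the input, which is the sense of "dynamic" assumed here. The verification step feeds its message back into the next annotation round instead of only rejecting the output.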
