🤖 AI Summary
To address the risk of clinically erroneous synthetic data generated by large language models (LLMs) in high-stakes medical applications, this paper proposes the Query-based Model Collaboration Framework (Q-MCF), an expert-guided, query-driven framework that dynamically injects domain-expert knowledge into the LLM generation process to preserve the factual accuracy of critical medical information. Q-MCF employs a lightweight collaborative mechanism that embeds structured expert queries directly into the LLM's inference pipeline, jointly optimizing data quality and clinical safety. Experiments across multiple clinical prediction tasks demonstrate that Q-MCF significantly reduces factual error rates (an average reduction of 32.7%) and enhances downstream model robustness and generalization, outperforming state-of-the-art data augmentation methods. To our knowledge, this is the first work to systematically integrate structured expert querying into an LLM-based data augmentation pipeline, establishing a novel, interpretable, and verifiable paradigm for trustworthy medical AI.
📝 Abstract
Data augmentation is a widely used strategy for improving model robustness and generalization by enriching training datasets with synthetic examples. While large language models (LLMs) have demonstrated strong generative capabilities for this purpose, their application in high-stakes domains such as healthcare poses unique challenges due to the risk of generating clinically incorrect or misleading information. In this work, we propose a novel query-based model collaboration framework that integrates expert-level domain knowledge to guide the augmentation process so that critical medical information is preserved. Experiments on clinical prediction tasks demonstrate that our lightweight collaboration-based approach consistently outperforms existing LLM augmentation methods while improving safety by reducing factual errors. This framework narrows the gap between the generative potential of LLM augmentation and the safety requirements of specialized domains.
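The expert-guided filtering idea described above can be sketched in code. The following is a minimal illustrative sketch only, not the paper's actual implementation: the names (`ExpertQuery`, `augment_with_expert_checks`, `stub_llm`) and the simple accept/reject logic are assumptions standing in for Q-MCF's structured expert querying.

```python
from dataclasses import dataclass

# Hypothetical sketch: each synthetic record produced by an LLM is
# checked against structured expert queries before being accepted
# into the augmented training set. All names and logic here are
# illustrative assumptions, not the framework's real API.

@dataclass
class ExpertQuery:
    field: str      # critical clinical field to verify
    expected: str   # value required by the expert knowledge source

def augment_with_expert_checks(llm_generate, queries, n_samples):
    """Keep only synthetic records whose critical fields pass every query."""
    accepted = []
    for _ in range(n_samples):
        record = llm_generate()  # one synthetic example, as a dict
        if all(record.get(q.field) == q.expected for q in queries):
            accepted.append(record)  # clinically consistent -> keep
    return accepted

# Stand-in generator: every other sample carries a clinically wrong dosage.
def stub_llm(_state=[0]):
    _state[0] += 1
    dose = "500mg" if _state[0] % 2 else "5000mg"
    return {"drug": "metformin", "dose": dose}

queries = [ExpertQuery(field="dose", expected="500mg")]
kept = augment_with_expert_checks(stub_llm, queries, n_samples=4)
print(len(kept))  # only the clinically consistent samples survive
```

In the actual framework the expert queries would presumably be answered by domain experts or a verified knowledge source rather than hard-coded expected values, and rejected samples could be regenerated instead of discarded; the sketch shows only the query-then-filter structure.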