Adaptive Cluster Collaborativeness Boosts LLMs Medical Decision Support Capacity

📅 2025-07-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing large language model (LLM) ensembles for clinical decision support are typically static and preconfigured, lacking adaptive component-selection mechanisms; this leads to insufficient model diversity, poor output consistency, and reliance on manual curation or domain-expert verification. Method: a training-free adaptive collaboration framework that dynamically selects high-diversity, high-consistency submodels via two mechanisms: "self-diversity maximization" (fuzzy matching over an LLM's own outputs) and "cross-consistency optimization" (cross-evaluation combined with progressive masking), enabling real-time ensemble reconfiguration and collaborative inference. Results: the approach outperforms GPT-4 on NEJMQA and MMLU-Pro-health, reaching 65.47% accuracy in Obstetrics and Gynecology (versus 56.12% for GPT-4) and meeting the official passing threshold across all medical specialties. This work points toward trustworthy multi-LLM clinical collaboration.

📝 Abstract
The collaborativeness of large language models (LLMs) has proven effective in natural language processing systems, holding considerable promise for healthcare applications. However, existing collaboration schemes lack explicit component selection rules, necessitating human intervention or clinical-specific validation. Moreover, existing architectures rely heavily on a predefined LLM cluster in which some LLMs underperform in medical decision support scenarios, invalidating the collaborativeness of the ensemble. To this end, we propose an adaptive cluster collaborativeness methodology involving self-diversity and cross-consistency maximization mechanisms to boost LLMs' medical decision support capacity. For self-diversity, we compute the fuzzy matching value of pairwise outputs within an LLM as its self-diversity value, then prioritize LLMs with high self-diversity values as cluster components in a training-free manner. For cross-consistency, we first measure cross-consistency values between the LLM with the highest self-diversity value and the others, and then gradually mask out the LLM with the lowest cross-consistency value to eliminate potentially inconsistent outputs during collaborative propagation. Extensive experiments on two specialized medical datasets, NEJMQA and MMLU-Pro-health, demonstrate the effectiveness of our method across physician-oriented specialties. For example, on NEJMQA, our method reaches the publicly official passing score across all disciplines, achieving 65.47% accuracy on the Obstetrics and Gynecology discipline compared to 56.12% for GPT-4.
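The self-diversity step described above can be sketched compactly: sample an LLM several times, average the pairwise fuzzy-match scores of its outputs, and rank models by the resulting diversity value. The sketch below is a minimal illustration, assuming `difflib.SequenceMatcher` as the fuzzy matcher (the paper does not specify which one it uses) and hypothetical helper names:

```python
from difflib import SequenceMatcher
from itertools import combinations

def self_diversity(outputs):
    """Estimate an LLM's self-diversity from repeated samples.

    Pairwise fuzzy-match ratios are averaged; lower average
    similarity means higher diversity.
    """
    pairs = list(combinations(outputs, 2))
    if not pairs:
        return 0.0
    mean_sim = sum(SequenceMatcher(None, a, b).ratio()
                   for a, b in pairs) / len(pairs)
    return 1.0 - mean_sim

def select_cluster(model_outputs, k=3):
    """Training-free selection: keep the k models whose sampled
    outputs are most diverse (hypothetical helper, not from the paper)."""
    ranked = sorted(model_outputs,
                    key=lambda m: self_diversity(model_outputs[m]),
                    reverse=True)
    return ranked[:k]
```

Here `model_outputs` maps each candidate model name to a list of its sampled answers; no training or gradient access is needed, matching the paper's training-free claim.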
Problem

Research questions and friction points this paper is trying to address.

LLM collaboration for healthcare lacks explicit component-selection rules, requiring human intervention or clinical validation
Predefined LLM clusters include models that underperform in medical decision support scenarios
An adaptive methodology is needed to boost LLMs' medical decision support capacity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive cluster collaborativeness methodology for LLMs
Self-diversity and cross-consistency maximization mechanisms
Training-free prioritization of high self-diversity LLMs
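The cross-consistency mechanism listed above pairs with a progressive-masking loop: score each remaining model against the highest-self-diversity "anchor" model and repeatedly drop the least consistent one. A minimal sketch, again assuming `difflib.SequenceMatcher` as the fuzzy scorer and hypothetical function names:

```python
from difflib import SequenceMatcher

def cross_consistency(anchor_answers, other_answers):
    """Mean fuzzy-match ratio between the anchor model's answers and
    another model's answers on the same questions."""
    scores = [SequenceMatcher(None, a, b).ratio()
              for a, b in zip(anchor_answers, other_answers)]
    return sum(scores) / len(scores)

def progressive_mask(answers, anchor, keep=2):
    """Gradually mask out the model with the lowest cross-consistency
    to the anchor until only `keep` models remain in the cluster."""
    others = [m for m in answers if m != anchor]
    while len(others) > keep - 1:
        worst = min(others,
                    key=lambda m: cross_consistency(answers[anchor],
                                                    answers[m]))
        others.remove(worst)
    return [anchor] + others
```

`answers` maps each model to its list of answers on a shared question set; the surviving cluster then performs the collaborative inference.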