CARE: Privacy-Compliant Agentic Reasoning with Evidence Discordance

📅 2026-04-01
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the degraded decision-making performance of large language models when clinical evidence is internally inconsistent, for example when patient-reported symptoms contradict physical signs, and proposes the CARE framework in response. CARE introduces a privacy-compliant dual-model architecture: a remote model generates structured reasoning paths without direct access to raw sensitive data, while a local model uses those paths to integrate the conflicting evidence and reach a decision. By combining multi-agent reasoning, structured state-transition modeling, and privacy-preserving isolation, CARE substantially outperforms existing baselines on the MIMIC-DOS dataset for predicting organ dysfunction deterioration in the ICU, demonstrating markedly stronger robustness to conflicting clinical evidence across key evaluation metrics.
📝 Abstract
Large language model (LLM) systems are increasingly used to support high-stakes decision-making, but they typically perform worse when the available evidence is internally inconsistent. Such scenarios arise in real-world healthcare settings, where patient-reported symptoms contradict medical signs. To study this problem, we introduce MIMIC-DOS, a dataset for short-horizon organ dysfunction worsening prediction in the intensive care unit (ICU) setting. We derive this dataset from the widely recognized MIMIC-IV, a publicly available electronic health record dataset, and construct it exclusively from cases in which signs and symptoms are discordant. This setting poses a substantial challenge for existing LLM-based approaches, with single-pass LLMs and agentic pipelines often struggling to reconcile such conflicting signals. To address this problem, we propose CARE: a multi-stage, privacy-compliant agentic reasoning framework in which a remote LLM provides guidance by generating structured categories and transitions without accessing sensitive patient data, while a local LLM uses these categories and transitions to support evidence acquisition and final decision-making. Empirically, CARE achieves stronger performance across all key metrics compared to multiple baseline settings, showing that it can more robustly handle conflicting clinical evidence while preserving privacy.
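The remote/local split sketched in the abstract can be pictured with a short example. Everything named below (call_remote_llm, call_local_llm, Guidance, the prompt wording, and the record keys) is an illustrative assumption, not the authors' implementation; the sketch only shows the stated division of labor, where the remote model sees a de-identified task description and returns structured categories and transitions, while the raw patient record never leaves the local side.

```python
# Minimal sketch of the remote/local split described above. All names here
# (call_remote_llm, call_local_llm, Guidance, the prompt wording, the record
# keys) are assumptions for illustration, not the authors' implementation.
import json
from dataclasses import dataclass


def call_remote_llm(prompt: str) -> str:
    """Placeholder for a remote, more capable LLM that never sees patient data."""
    raise NotImplementedError


def call_local_llm(prompt: str) -> str:
    """Placeholder for a locally hosted LLM that may read the raw record."""
    raise NotImplementedError


@dataclass
class Guidance:
    categories: list[str]   # evidence categories the local model should collect
    transitions: list[str]  # candidate state transitions to weigh the evidence against


def remote_guidance(task_description: str) -> Guidance:
    # Only a de-identified task description crosses the privacy boundary.
    prompt = (
        "Task: predict short-horizon organ dysfunction worsening in the ICU.\n"
        f"Setting: {task_description}\n"
        'Return JSON with keys "categories" and "transitions".'
    )
    parsed = json.loads(call_remote_llm(prompt))
    return Guidance(parsed["categories"], parsed["transitions"])


def local_decision(patient_record: dict, guidance: Guidance) -> str:
    # The raw record stays on the local side; the remote guidance only
    # structures which evidence is gathered and how conflicts are reasoned over.
    evidence = {c: patient_record.get(c, "not documented") for c in guidance.categories}
    prompt = (
        f"Evidence by category: {json.dumps(evidence)}\n"
        f"Candidate transitions: {guidance.transitions}\n"
        "Symptoms and signs may conflict; reconcile them and state whether "
        "organ dysfunction will worsen, with a brief rationale."
    )
    return call_local_llm(prompt)
```

The key property this sketch illustrates is the isolation boundary: patient-specific fields appear only inside local_decision, while remote_guidance operates purely on an abstract description of the task.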
Problem

Research questions and friction points this paper is trying to address.

evidence discordance
large language models
clinical decision-making
conflicting evidence
privacy-compliant reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

privacy-compliant reasoning
evidence discordance
agentic LLM framework
MIMIC-DOS
conflicting clinical evidence
🔎 Similar Papers
No similar papers found.
Haochen Liu
University of Cambridge
Weien Li
McGill University
Rui Song
McGill University
Zeyu Li
McGill University
Chun Jason Xue
Mohamed bin Zayed University of Artificial Intelligence (MBZUAI)
Systems and Storage
Xiao-Yang Liu
Columbia University
Tensor, Deep Learning, Reinforcement Learning, Big Data
Sam Nallaperuma
University of Cambridge
Xue Liu
McGill University, MBZUAI - Mohamed bin Zayed University of Artificial Intelligence, Mila - Quebec AI Institute
Ye Yuan
McGill University, Mila - Quebec AI Institute
Generative Modeling, Black Box Optimization, Knowledge-Centric NLP, LLMs