CARE: Privacy-Compliant Agentic Reasoning with Evidence Discordance

📅 2026-04-01
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the degraded decision-making performance of large language models when clinical evidence is internally inconsistent, for example when patient-reported symptoms contradict physical signs, and proposes the CARE framework in response. CARE introduces a privacy-compliant dual-model architecture: a remote model generates structured reasoning paths without direct access to raw sensitive data, while a local model uses those paths to integrate the conflicting evidence and reach a decision. By combining multi-agent reasoning, structured state-transition modeling, and privacy-preserving isolation, CARE substantially outperforms existing baselines on the MIMIC-DOS dataset for predicting organ dysfunction deterioration in the ICU, demonstrating markedly stronger robustness to conflicting clinical evidence across key evaluation metrics.
📝 Abstract
Large language model (LLM) systems are increasingly used to support high-stakes decision-making, but they typically perform worse when the available evidence is internally inconsistent. Such scenarios arise in real-world healthcare settings, where patient-reported symptoms contradict medical signs. To study this problem, we introduce MIMIC-DOS, a dataset for short-horizon organ dysfunction worsening prediction in the intensive care unit (ICU) setting. We derive this dataset from the widely recognized MIMIC-IV, a publicly available electronic health record dataset, and construct it exclusively from cases in which signs and symptoms are discordant. This setting poses a substantial challenge for existing LLM-based approaches, with single-pass LLMs and agentic pipelines often struggling to reconcile such conflicting signals. To address this problem, we propose CARE: a multi-stage, privacy-compliant agentic reasoning framework in which a remote LLM provides guidance by generating structured categories and transitions without accessing sensitive patient data, while a local LLM uses these categories and transitions to support evidence acquisition and final decision-making. Empirically, CARE achieves stronger performance across all key metrics compared to multiple baseline settings, showing that it can more robustly handle conflicting clinical evidence while preserving privacy.
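The remote/local split sketched in the abstract can be pictured with a short example. Everything named below (call_remote_llm, call_local_llm, Guidance, the prompt wording, and the record keys) is an illustrative assumption, not the authors' implementation; the sketch only shows the stated division of labor, where the remote model sees a de-identified task description and returns structured categories and transitions, while the raw patient record never leaves the local side.

```python
# Minimal sketch of the remote/local split described above. All names here
# (call_remote_llm, call_local_llm, Guidance, the prompt wording, the record
# keys) are assumptions for illustration, not the authors' implementation.
import json
from dataclasses import dataclass


def call_remote_llm(prompt: str) -> str:
    """Placeholder for a remote, more capable LLM that never sees patient data."""
    raise NotImplementedError


def call_local_llm(prompt: str) -> str:
    """Placeholder for a locally hosted LLM that may read the raw record."""
    raise NotImplementedError


@dataclass
class Guidance:
    categories: list[str]   # evidence categories the local model should collect
    transitions: list[str]  # candidate state transitions to weigh the evidence against


def remote_guidance(task_description: str) -> Guidance:
    # Only a de-identified task description crosses the privacy boundary.
    prompt = (
        "Task: predict short-horizon organ dysfunction worsening in the ICU.\n"
        f"Setting: {task_description}\n"
        'Return JSON with keys "categories" and "transitions".'
    )
    parsed = json.loads(call_remote_llm(prompt))
    return Guidance(parsed["categories"], parsed["transitions"])


def local_decision(patient_record: dict, guidance: Guidance) -> str:
    # The raw record stays on the local side; the remote guidance only
    # structures which evidence is gathered and how conflicts are reasoned over.
    evidence = {c: patient_record.get(c, "not documented") for c in guidance.categories}
    prompt = (
        f"Evidence by category: {json.dumps(evidence)}\n"
        f"Candidate transitions: {guidance.transitions}\n"
        "Symptoms and signs may conflict; reconcile them and state whether "
        "organ dysfunction will worsen, with a brief rationale."
    )
    return call_local_llm(prompt)
```

The key property this sketch illustrates is the isolation boundary: patient-specific fields appear only inside local_decision, while remote_guidance operates purely on an abstract description of the task.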
Problem

Research questions and friction points this paper is trying to address.

evidence discordance
large language models
clinical decision-making
conflicting evidence
privacy-compliant reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

privacy-compliant reasoning
evidence discordance
agentic LLM framework
MIMIC-DOS
conflicting clinical evidence
🔎 Similar Papers
No similar papers found.
Haochen Liu
University of Cambridge
Weien Li
McGill University
Rui Song
McGill University
Zeyu Li
McGill University
Chun Jason Xue
Mohamed bin Zayed University of Artificial Intelligence (MBZUAI)
Systems and Storage
Xiao-Yang Liu
Columbia University
Tensor, Deep Learning, Reinforcement Learning, Big Data
Sam Nallaperuma
University of Cambridge
Xue Liu
McGill University, MBZUAI - Mohamed bin Zayed University of Artificial Intelligence, Mila - Quebec AI Institute
Ye Yuan
McGill University, Mila - Quebec AI Institute
Generative Modeling, Black Box Optimization, Knowledge-Centric NLP, LLMs