MedConceal: A Benchmark for Clinical Hidden-Concern Reasoning Under Partial Observability

📅 2026-04-09

🤖 AI Summary
This study addresses the challenge of identifying and responding to patients' concealed concerns or misconceptions in medical dialogue. Because patients often withhold these concerns, clinicians face an information asymmetry that existing systems handle poorly under partial observability. To this end, the authors introduce MedConceal, an interactive benchmark comprising 300 curated clinical cases and 600 clinician-LLM interactions. MedConceal incorporates an expert-developed taxonomy of hidden concerns and a process-aware evaluation framework that distinguishes two abilities: confirmation (surfacing hidden concerns through multi-turn dialogue) and intervention (addressing the primary concern and guiding the patient toward a target plan). Clinical plausibility is supported by clinician review of the interactive patient simulator, which tracks turn-level communication signals. Experiments show that no single system dominates: frontier large language models lead on different confirmation metrics, while human clinicians (N=159) remain strongest on intervention success, underscoring that reasoning about hidden concerns remains a key unresolved challenge for medical dialogue systems.

📝 Abstract
Patient-clinician communication is an asymmetric-information problem: patients often do not disclose fears, misconceptions, or practical barriers unless clinicians elicit them skillfully. Effective medical dialogue therefore requires reasoning under partial observability: clinicians must elicit latent concerns, confirm them through interaction, and respond in ways that guide patients toward appropriate care. However, existing medical dialogue benchmarks largely sidestep this challenge by exposing hidden patient state, collapsing elicitation into extraction, or evaluating responses without modeling what remains hidden. We present MedConceal, a benchmark with an interactive patient simulator for evaluating hidden-concern reasoning in medical dialogue, comprising 300 curated cases and 600 clinician-LLM interactions. Built from clinician-answered online health discussions, each case pairs clinician-visible context with simulator-internal hidden concerns derived from prior literature and structured using an expert-developed taxonomy. The simulator withholds these concerns from the dialogue agent, tracks whether they have been revealed and addressed via theory-grounded turn-level communication signals, and is clinician-reviewed for clinical plausibility. This enables process-aware evaluation of both task success and the interaction process that leads to it. We study two abilities: confirmation, surfacing hidden concerns through multi-turn dialogue, and intervention, addressing the primary concern and guiding the patient toward a target plan. Results show that no single system dominates: frontier models lead on different confirmation metrics, while human clinicians (N=159) remain strongest on intervention success. Together, these results identify hidden-concern reasoning under partial observability as a key unresolved challenge for medical dialogue systems.
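To make the partial-observability setup described in the abstract more concrete, the following is a minimal sketch of a case whose hidden concerns are withheld from the dialogue agent and flagged as revealed or addressed turn by turn. It is not the authors' implementation: the class names, the concern labels, the rule that a concern must be revealed before it can count as addressed, and both scoring functions are illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class HiddenConcern:
    label: str               # hypothetical label, not the paper's taxonomy
    primary: bool = False    # marks the concern the intervention should target
    revealed: bool = False
    addressed: bool = False


@dataclass
class SimulatedCase:
    visible_context: str
    concerns: list[HiddenConcern] = field(default_factory=list)

    def agent_view(self) -> dict:
        # The agent sees the context plus only the concerns surfaced so far.
        return {
            "context": self.visible_context,
            "revealed": [c.label for c in self.concerns if c.revealed],
        }

    def record_turn(self, revealed=(), addressed=()) -> None:
        # `revealed` / `addressed` stand in for turn-level signals that an
        # annotator or judge would supply; here they are plain label sets.
        for c in self.concerns:
            if c.label in revealed:
                c.revealed = True
            # Assumption: a concern only counts as addressed once revealed.
            if c.label in addressed and c.revealed:
                c.addressed = True

    def confirmation_rate(self) -> float:
        # Illustrative metric: fraction of hidden concerns surfaced.
        if not self.concerns:
            return 0.0
        return sum(c.revealed for c in self.concerns) / len(self.concerns)

    def primary_addressed(self) -> bool:
        # Illustrative proxy for intervention success.
        return any(c.primary and c.addressed for c in self.concerns)


case = SimulatedCase(
    visible_context="Adult patient asking about a new prescription",
    concerns=[
        HiddenConcern("fear_of_side_effects", primary=True),
        HiddenConcern("cost_barrier"),
    ],
)
case.record_turn(addressed={"fear_of_side_effects"})  # ignored: not yet revealed
case.record_turn(revealed={"fear_of_side_effects"})
case.record_turn(addressed={"fear_of_side_effects"})
print(case.agent_view(), case.confirmation_rate(), case.primary_addressed())
```

Keeping the revealed and addressed flags separate is what allows a process-aware score in this sketch: an agent that reassures the patient without ever surfacing the concern gets no credit for intervention.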
Problem

Research questions and friction points this paper is trying to address.

partial observability
hidden concerns
medical dialogue
asymmetric information
clinical communication
Innovation

Methods, ideas, or system contributions that make the work stand out.

partial observability
hidden-concern reasoning
interactive patient simulator
medical dialogue benchmark
process-aware evaluation