🤖 AI Summary
This work addresses a key limitation of existing large language model (LLM)-based clinical diagnosis systems: they typically assume fully observed patient information and do not model sequential diagnostic reasoning under uncertainty. The authors formalize the task as a latent diagnostic trajectory learning problem and propose a dual-LLM architecture. A diagnostic agent treats diagnostic action sequences as latent paths and defines a posterior distribution that favors more informative trajectories; a planning agent is trained to follow this posterior, producing coherent action sequences that progressively reduce uncertainty. The core contribution is this trajectory-level posterior alignment, which ablation studies identify as critical to the gains. On the MIMIC-CDM benchmark, the method outperforms existing baselines in diagnostic accuracy while requiring fewer diagnostic tests.
📝 Abstract
Clinical diagnosis requires sequential evidence acquisition under uncertainty. However, most Large Language Model (LLM)-based diagnostic systems assume fully observed patient information and therefore do not explicitly model how clinical evidence should be acquired over time. Even when diagnosis is formulated as a sequential decision process, learning effective diagnostic trajectories remains challenging: the space of possible evidence-acquisition paths is large, while clinical datasets rarely provide explicit supervision for desirable diagnostic paths. To this end, we cast sequential diagnosis as a latent trajectory learning problem and propose the Latent Diagnostic Trajectory Learning (LDTL) framework, built on a planning LLM agent and a diagnostic LLM agent. For the diagnostic LLM agent, diagnostic action sequences are treated as latent paths, and we introduce a posterior distribution that prioritizes trajectories providing more diagnostic information. The planning LLM agent is then trained to follow this distribution, encouraging coherent diagnostic trajectories that progressively reduce uncertainty. Experiments on the MIMIC-CDM benchmark demonstrate that LDTL outperforms existing baselines in diagnostic accuracy under a sequential clinical diagnosis setting, while requiring fewer diagnostic tests. Furthermore, ablation studies highlight the critical role of trajectory-level posterior alignment in achieving these improvements.
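
The abstract does not give the exact form of the trajectory posterior or the alignment objective, so the following is only a minimal sketch of the idea under stated assumptions. It assumes that "diagnostic information" is measured as the entropy reduction of the diagnostic agent's disease belief along a trajectory, that the posterior is a softmax over those gains, and that alignment is a cross-entropy between this posterior and the planner's trajectory distribution. All function names, the `temperature` parameter, and the example numbers are hypothetical and are not taken from the paper.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def information_gain(belief_path):
    """Entropy drop from the first to the last disease belief on one trajectory."""
    return entropy(belief_path[0]) - entropy(belief_path[-1])


def trajectory_posterior(gains, temperature=1.0):
    """Assumed form: softmax over per-trajectory information gains."""
    scaled = [g / temperature for g in gains]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def alignment_loss(target, planner_probs):
    """Cross-entropy of the planner's trajectory distribution against the target."""
    return -sum(t * math.log(q) for t, q in zip(target, planner_probs) if t > 0)


# Hypothetical beliefs over three diseases along two candidate test sequences.
uniform = [1 / 3, 1 / 3, 1 / 3]
informative_path = [uniform, [0.6, 0.3, 0.1], [0.9, 0.05, 0.05]]
uninformative_path = [uniform, [0.4, 0.3, 0.3], [0.4, 0.35, 0.25]]

gains = [information_gain(informative_path), information_gain(uninformative_path)]
target = trajectory_posterior(gains)

# A planner that spreads probability evenly incurs a higher loss than one
# that already matches the target distribution.
loss_uniform_planner = alignment_loss(target, [0.5, 0.5])
loss_matched_planner = alignment_loss(target, target)
```

In this toy setup the trajectory that sharpens the disease belief receives more posterior mass, and a planner that matches the posterior achieves the lower alignment loss. In an actual system the beliefs would come from the diagnostic LLM and the planner probabilities from the planning LLM's sequence likelihoods.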