🤖 AI Summary
Large language models (LLMs) handle commonsense reasoning well but struggle with personalized decision-making, such as individualized nutritional interventions, that requires integrating multifactor personal data. To address this limitation, the paper proposes Personalized Causal Graph Reasoning, an agentic framework in which an LLM reasons over a personal causal graph derived from an individual's data; in the case study, the graph captures that individual's diet–glucose relationships. Two evaluations are used: a counterfactual evaluation estimates the effect of the recommended foods on glycemic control, measured by incremental area under the curve (iAUC), and an LLM-as-a-judge evaluation assesses how personalized the reasoning is. In the reported experiments, the framework reduces average glucose iAUC across three time windows, outperforms the previous approach, and improves personalization in the reasoning process.
📝 Abstract
Large Language Models (LLMs) effectively leverage common-sense knowledge for general reasoning, yet they struggle with personalized reasoning when asked to interpret multifactor personal data. This limitation restricts their applicability in domains that require context-aware decision-making tailored to individuals. This paper introduces Personalized Causal Graph Reasoning, an agentic framework that enhances LLM reasoning by incorporating personal causal graphs derived from each individual's data. These graphs provide a foundation that guides the LLM's reasoning process. We evaluate the framework in a case study of nutrient-oriented dietary recommendations, a task that requires personal reasoning because the effects of diet are implicit and differ from person to person. We propose a counterfactual evaluation to estimate the effectiveness of LLM-recommended foods for glucose management. Results show that the proposed method provides personalized dietary recommendations that reduce average glucose incremental area under the curve (iAUC) across three time windows, outperforming the previous approach. LLM-as-a-judge evaluation results indicate that the proposed method also enhances personalization in the reasoning process.
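To make the iAUC metric concrete, the following is a minimal sketch of how glucose iAUC over several time windows could be computed and compared between an observed meal response and a counterfactual one. It is not the paper's implementation. It assumes that the baseline is the first reading, that readings below baseline are clipped to zero before trapezoidal integration (a common simplification), and that the counterfactual curve is supplied by some model outside this snippet. The function names, the window lengths (30, 60, and 120 minutes), and all glucose values are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def incremental_auc(times_min: Sequence[float],
                    glucose: Sequence[float],
                    window_min: float) -> float:
    """Trapezoidal area above baseline within the first `window_min` minutes.

    Assumptions (not taken from the paper): baseline = first reading;
    increments below baseline are clipped to zero before integrating.
    """
    if len(times_min) != len(glucose) or len(times_min) < 2:
        raise ValueError("need two or more paired time/glucose readings")

    start = times_min[0]
    baseline = glucose[0]
    area = 0.0
    for i in range(1, len(times_min)):
        if times_min[i] - start > window_min:
            break  # readings past the window are ignored
        left = max(glucose[i - 1] - baseline, 0.0)
        right = max(glucose[i] - baseline, 0.0)
        area += 0.5 * (left + right) * (times_min[i] - times_min[i - 1])
    return area


def iauc_reduction(times_min: Sequence[float],
                   observed: Sequence[float],
                   counterfactual: Sequence[float],
                   window_min: float) -> float:
    """Observed iAUC minus counterfactual iAUC; positive means a reduction."""
    return (incremental_auc(times_min, observed, window_min)
            - incremental_auc(times_min, counterfactual, window_min))


# Hypothetical readings (mg/dL) sampled every 30 minutes after a meal.
times = [0, 30, 60, 90, 120]
observed = [90, 140, 130, 100, 90]        # response to the original meal
counterfactual = [90, 120, 115, 95, 90]   # assumed response to the recommended meal

for window in (30, 60, 120):              # illustrative windows only
    print(window, iauc_reduction(times, observed, counterfactual, window))
```

With these made-up readings, the 120-minute iAUC is 3000 mg·min/dL for the observed curve and 1800 mg·min/dL for the counterfactual curve, a reduction of 1200. Averaging such reductions over meals and individuals would give one summary number per time window.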