Interpretable Probability Estimation with LLMs via Shapley Reconstruction

πŸ“… 2026-01-14
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
This work proposes PRISM, a novel framework that addresses the limitations of large language models (LLMs) in high-stakes domains such as finance and healthcare, where noisy probability estimates and opaque decision-making hinder reliable deployment. PRISM introduces Shapley values into LLM-based probabilistic prediction for the first time, enabling factor-level interpretability by quantifying the marginal contribution of each input feature. Leveraging these contributions, the framework recalibrates the model’s output probabilities to enhance both accuracy and transparency. Empirical evaluations across finance, healthcare, and agriculture demonstrate that PRISM significantly outperforms standard prompting and other baseline methods. Furthermore, by visualizing the distribution of feature contributions, PRISM fosters greater user trust in model decisions without compromising predictive performance.

πŸ“ Abstract
Large Language Models (LLMs) demonstrate potential to estimate the probability of uncertain events by leveraging their extensive knowledge and reasoning capabilities. This ability can support intelligent decision-making across diverse fields, such as financial forecasting and preventive healthcare. However, directly prompting LLMs for probability estimation faces significant challenges: their outputs are often noisy, and the underlying prediction process is opaque. In this paper, we propose PRISM: Probability Reconstruction via Shapley Measures, a framework that brings transparency and precision to LLM-based probability estimation. PRISM decomposes an LLM's prediction by quantifying the marginal contribution of each input factor using Shapley values. These factor-level contributions are then aggregated to reconstruct a calibrated final estimate. In our experiments, we demonstrate that PRISM improves predictive accuracy over direct prompting and other baselines across multiple domains, including finance, healthcare, and agriculture. Beyond performance, PRISM provides a transparent prediction pipeline: our case studies visualize how individual factors shape the final estimate, helping build trust in LLM-based decision support systems.
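The core mechanism described in the abstract, scoring each input factor by its Shapley value and then reconstructing the final probability from those contributions, can be sketched with exact Shapley enumeration over a small factor set. The `toy_prob` function and the factor names below are illustrative stand-ins for an LLM conditioned on a subset of factors, not the paper's actual implementation, and real systems would likely approximate the exponential subset enumeration:

```python
from itertools import combinations
from math import factorial

def shapley_contributions(factors, prob_fn):
    """Exact Shapley value of each factor's marginal contribution to
    prob_fn, which maps a frozenset of factors to a probability.
    Feasible only for small factor sets (2^n subset evaluations)."""
    n = len(factors)
    contrib = {}
    for f in factors:
        others = [g for g in factors if g != f]
        total = 0.0
        for k in range(n):  # subset sizes 0 .. n-1 over the other factors
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Standard Shapley weight: |S|! (n-|S|-1)! / n!
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (prob_fn(s | {f}) - prob_fn(s))
        contrib[f] = total
    return contrib

# Toy stand-in for an LLM probability estimate given a subset of factors
# (here a simple additive model; a real factor could shift the estimate
# differently depending on which other factors are present).
def toy_prob(subset):
    base = 0.30
    effects = {"revenue_growth": 0.15, "debt_ratio": -0.10, "sector_trend": 0.05}
    return base + sum(effects[f] for f in subset)

factors = ["revenue_growth", "debt_ratio", "sector_trend"]
phi = shapley_contributions(factors, toy_prob)

# Efficiency property of Shapley values: the no-information baseline plus
# the summed contributions reconstructs the full-information estimate,
# which is the kind of recalibrated output PRISM aggregates.
reconstructed = toy_prob(frozenset()) + sum(phi.values())
```

Because the toy model is additive, each factor's Shapley value equals its individual effect, and `reconstructed` matches `toy_prob` evaluated on all three factors (0.40); with interacting factors the Shapley weighting is what fairly splits the interaction terms.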
Problem

Research questions and friction points this paper is trying to address.

probability estimation · Large Language Models · interpretability · Shapley values · decision support
Innovation

Methods, ideas, or system contributions that make the work stand out.

Shapley values · probability estimation · interpretable AI · large language models · model calibration
Authors
Yang Nan
Qihao Wen
Jiahao Wang
Pengfei He (PhD student, Michigan State University; Machine Learning, Trustworthy AI, High-dimensional Statistics, Causal Mediation Analysis)
Ravi Tandon
Yong Ge
Han Xu