🤖 AI Summary
Large language models (LLMs) lack explicit uncertainty modeling and the ability to represent latent structure, which limits their performance on compositional reasoning tasks.
Method: We propose Verbalized Probabilistic Graphical Modeling (vPGM), a zero-shot Bayesian prompting framework that encodes core principles of probabilistic graphical models (PGMs), namely variable dependence, conditional independence, and Bayesian updating, into natural-language prompts, without fine-tuning or expert-designed PGMs. The method combines Bayesian prompt engineering, chain-structured reasoning guidance, and confidence calibration.
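To make the idea concrete, here is a minimal sketch of how such a prompt might be assembled. The function name and the template wording are illustrative assumptions, not the paper's actual prompts; the point is only that the PGM steps (latent variables, dependencies, belief updating, confidence) are expressed in plain language rather than as a formal graph:

```python
# Hypothetical sketch of a vPGM-style prompt builder. The template text is
# an assumption for illustration, not the paper's exact wording.

def build_vpgm_prompt(question: str) -> str:
    """Compose a zero-shot prompt that verbalizes core PGM steps:
    (1) identify latent variables, (2) state their dependencies,
    (3) reason along the dependency chain with Bayesian updating,
    (4) report a calibrated confidence."""
    steps = [
        "1. List the latent variables relevant to the question.",
        "2. State which variables depend on which, and which are "
        "conditionally independent.",
        "3. Reason step by step along the dependency chain, updating your "
        "belief as each piece of evidence is considered (Bayesian updating).",
        "4. Give your final answer with a confidence between 0 and 1.",
    ]
    return f"Question: {question}\n\n" + "\n".join(steps)

prompt = build_vpgm_prompt("Will the package arrive on time?")
```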
Results: Experiments show significant improvements in answer quality and confidence calibration across diverse compositional reasoning tasks, with strong robustness in few-shot and open-domain settings. By grounding LLM inference in interpretable, calibrated probabilistic reasoning, the framework advances the integration of formal uncertainty quantification into LLM-based reasoning systems.
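The summary above does not specify which calibration metric is used; Expected Calibration Error (ECE) is a common choice for evaluating stated confidences, so here is a minimal sketch under that assumption:

```python
# Assumed metric for illustration: Expected Calibration Error (ECE) bins
# predictions by stated confidence and averages the gap between each bin's
# mean confidence and its empirical accuracy, weighted by bin size.

def expected_calibration_error(confidences, correct, n_bins=10):
    """confidences: list of floats in [0, 1]; correct: list of 0/1 labels."""
    n = len(confidences)
    ece = 0.0
    for b in range(n_bins):
        lo, hi = b / n_bins, (b + 1) / n_bins
        # Half-open bins (lo, hi], with 0.0 falling into the first bin.
        idx = [i for i, c in enumerate(confidences)
               if lo < c <= hi or (b == 0 and c == 0.0)]
        if not idx:
            continue
        avg_conf = sum(confidences[i] for i in idx) / len(idx)
        accuracy = sum(correct[i] for i in idx) / len(idx)
        ece += len(idx) / n * abs(avg_conf - accuracy)
    return ece
```

A perfectly calibrated model (e.g. 80% confidence with 80% accuracy in every bin) yields an ECE of 0; larger values indicate over- or under-confidence.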
📝 Abstract
Human cognition excels at transcending sensory input and forming latent representations that structure our understanding of the world. Although Large Language Models (LLMs) can produce chain-of-thought reasoning, they lack a principled framework to capture latent structures and model uncertainty, especially in compositional reasoning tasks. We propose Verbalized Probabilistic Graphical Modeling (vPGM), a Bayesian prompting framework that guides LLMs to simulate key principles of Probabilistic Graphical Models (PGMs) in natural language. Unlike many traditional probabilistic methods requiring substantial domain expertise or specialized training, vPGM bypasses expert-driven model design, making it well-suited for scenarios with limited assumptions or scarce data. We evaluated our model on several compositional reasoning tasks, both closed-ended and open-ended. Our results indicate that the model effectively enhances confidence calibration and text generation quality.