🤖 AI Summary
Large language models (LLMs) struggle to accumulate evidence and perform Bayesian belief updating across multi-turn interactions, and existing approaches often rely on fine-tuning with sensitive user data, posing privacy risks. To overcome these limitations, the authors propose AdaptFuse—a training-free framework that decouples probabilistic reasoning from the frozen LLM: a symbolic module maintains a discrete hypothesis space with explicit Bayesian posteriors, and this posterior is dynamically fused with the LLM’s semantic reasoning through an entropy-adaptive mechanism. AdaptFuse achieves, for the first time, Bayesian-style sequential preference learning without any model fine-tuning. Evaluated on flight, hotel, and e-commerce recommendation tasks with Gemma 2 9B, Llama 3 8B, and Qwen 2.5 7B, it consistently outperforms both prompting and fine-tuning baselines, with accuracy improving monotonically over interaction rounds while preserving user privacy and enabling dynamic information integration.
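The core loop described above—a symbolic Bayesian update followed by confidence-weighted fusion with an LLM-derived distribution—can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the inverse-normalized-entropy confidence measure, and the toy likelihoods are all assumptions.

```python
import numpy as np

def bayes_update(prior, likelihood):
    """One round of Bayesian belief updating over a discrete hypothesis set."""
    post = prior * likelihood
    return post / post.sum()

def entropy(p):
    """Shannon entropy in nats, guarding against log(0)."""
    p = np.clip(p, 1e-12, 1.0)
    return -(p * np.log(p)).sum()

def entropy_adaptive_fuse(p_sym, p_llm):
    """Weight each distribution by its confidence (1 - normalized entropy).

    As the symbolic posterior sharpens over rounds, its entropy drops and it
    dominates the fused distribution—the shift in reliance the summary
    describes. The exact weighting scheme here is illustrative.
    """
    h_max = np.log(len(p_sym))             # maximum possible entropy
    conf_sym = 1.0 - entropy(p_sym) / h_max
    conf_llm = 1.0 - entropy(p_llm) / h_max
    w = conf_sym / (conf_sym + conf_llm + 1e-12)
    fused = w * p_sym + (1.0 - w) * p_llm
    return fused / fused.sum()

# Toy example: 4 hypotheses; the symbolic posterior sharpens over two rounds.
prior = np.full(4, 0.25)
post = bayes_update(prior, np.array([0.7, 0.1, 0.1, 0.1]))
post = bayes_update(post, np.array([0.8, 0.1, 0.05, 0.05]))
p_llm = np.array([0.4, 0.3, 0.2, 0.1])    # flatter LLM-derived distribution
fused = entropy_adaptive_fuse(post, p_llm)
```

After two consistent observations the symbolic posterior is much lower-entropy than the LLM distribution, so the fused result stays close to the symbolic belief—the LLM only nudges it.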
📝 Abstract
Large language models struggle to accumulate evidence across multiple rounds of user interaction, failing to update their beliefs in a manner consistent with Bayesian inference. Existing solutions require fine-tuning on sensitive user interaction data, limiting their applicability in privacy-conscious settings. We propose AdaptFuse, a training-free framework that externalizes probabilistic computation entirely from the LLM: a symbolic module maintains a Bayesian posterior over a discrete hypothesis set, while a frozen LLM contributes semantic reasoning via multi-sample Dirichlet aggregation. The two signals are combined through entropy-adaptive fusion, which automatically weights each source by its predictive confidence, shifting reliance from the LLM to the symbolic posterior as evidence accumulates. We evaluate on three domains (flight recommendation, hotel recommendation, and web shopping) using Gemma 2 9B, Llama 3 8B, and Qwen 2.5 7B. AdaptFuse consistently outperforms both prompting baselines and fine-tuned Bayesian Teaching models on all tasks, with accuracy improving monotonically over interaction rounds. These results demonstrate that principled inference-time algorithms can substitute for fine-tuning in personalized recommendation, without storing or training on sensitive user data. All code and materials will be open-sourced.
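The "multi-sample Dirichlet aggregation" mentioned in the abstract can be read as sampling the frozen LLM several times and smoothing the resulting hypothesis counts under a symmetric Dirichlet prior. The sketch below shows this idea; the function name, the symmetric prior `alpha0`, and the toy sample counts are assumptions, not details taken from the paper.

```python
import numpy as np

def dirichlet_aggregate(samples, n_hypotheses, alpha0=1.0):
    """Aggregate repeated LLM picks into a smoothed distribution.

    Each sample is the index of the hypothesis the frozen LLM selected on one
    stochastic decoding pass. Treating the counts as observations under a
    symmetric Dirichlet(alpha0) prior yields the posterior-mean distribution,
    so hypotheses never chosen still retain nonzero probability.
    """
    counts = np.bincount(samples, minlength=n_hypotheses).astype(float)
    alpha = alpha0 + counts                 # Dirichlet posterior parameters
    return alpha / alpha.sum()              # posterior mean over hypotheses

# Toy example: 5 decoding passes over 4 candidate items.
samples = [0, 0, 2, 0, 1]
p_llm = dirichlet_aggregate(samples, n_hypotheses=4)
# counts [3, 1, 1, 0] -> alpha [4, 2, 2, 1] -> mean [4/9, 2/9, 2/9, 1/9]
```

The Dirichlet smoothing matters for the fusion step: a raw vote histogram could assign zero probability to a hypothesis, which would make it unrecoverable under multiplicative Bayesian updates downstream.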