🤖 AI Summary
This work addresses the lack of a computable criterion for evaluating whether mechanistic priors genuinely improve sample efficiency in sequential decision-making. We introduce “mechanistic information,” the mutual information between a model’s recommended policy and the true optimal policy, and combine it with an occupancy-weighted bias metric to establish matched bounds showing that Bayesian regret scales with the residual entropy. The framework provides both a theoretical characterization and an empirical validation of the value of mechanistic priors. Using hybrid mechanistic modeling, information-theoretic analysis, and pharmacokinetic simulations of a 5-FU dosing task, we show that mechanistic priors substantially improve early-stage sample efficiency, whereas large language model priors can incur severe mechanistic information loss, which matters most in safety-critical settings.
📝 Abstract
Hybrid mechanistic models, which combine physical priors with learned residuals, promise to reduce the data required for good decisions, but there is no computable criterion for testing this promise. We characterize the value of mechanistic priors in sequential decision-making in both the asymptotic and burn-in regimes. To formalize this, we introduce the mechanistic information of a model, defined as the mutual information between the model's recommended policy $\hat{\pi}$ and the true optimal policy $\pi^*$ and quantified via an occupancy-weighted bias $B_\mu$. In the asymptotic regime (large $N$), matched bounds reveal that Bayesian regret scales with the residual entropy $H_{\mathrm{mech}}$, delivering a theoretical sample-complexity reduction by a factor of $H(\mu)/H_{\mathrm{mech}}$ relative to an uninformed baseline. Furthermore, we provide a model certificate for determining empirical sample efficiency. Complementarily, in the clinically relevant burn-in regime (small $N$), we establish a lower bound on the penalty incurred by confidently wrong priors. We demonstrate both the asymptotic and burn-in bounds in 5-fluorouracil (5-FU) dosing simulations motivated by published FOLFOX pharmacokinetic data, where a hybrid prior yields large sample-efficiency gains in the burn-in regime. Finally, we contrast these grounded models with LLM priors, demonstrating that LLMs can suffer severe losses in mechanistic information, thereby motivating the exclusive use of physically grounded priors for safety-critical applications.
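As a rough illustration of the quantities named above, the following sketch computes a mutual information between a recommended policy and an optimal policy on a toy discrete problem, then forms an entropy ratio of the kind the abstract describes. Everything here is an assumption made for illustration: the two-policy joint distribution is invented, the helper names `entropy` and `mutual_information` are hypothetical, and treating the residual entropy as the prior entropy minus the mutual information is one plausible reading rather than the paper's stated definition of $H_{\mathrm{mech}}$.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a discrete distribution (zero entries skipped)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def mutual_information(joint):
    """Mutual information in bits from a joint table.

    Rows index the recommended policy, columns index the true optimal policy.
    The table is normalized first, so raw counts are accepted too.
    """
    total = sum(sum(row) for row in joint)
    joint = [[v / total for v in row] for row in joint]
    p_hat = [sum(row) for row in joint]          # marginal over recommended policy
    p_star = [sum(col) for col in zip(*joint)]   # marginal over optimal policy
    flat = [v for row in joint for v in row]
    return entropy(p_hat) + entropy(p_star) - entropy(flat)


# Toy joint distribution (assumed, not taken from the paper): the model
# recommends the truly optimal policy 80% of the time.
joint = [[0.4, 0.1],
         [0.1, 0.4]]

h_prior = entropy([sum(col) for col in zip(*joint)])  # entropy of the optimal policy
mech_info = mutual_information(joint)                 # information the model carries
h_residual = h_prior - mech_info                      # assumed stand-in for H_mech
reduction = h_prior / h_residual                      # entropy ratio, as in H(mu)/H_mech

print(f"prior entropy    = {h_prior:.3f} bits")
print(f"mutual info      = {mech_info:.3f} bits")
print(f"residual entropy = {h_residual:.3f} bits")
print(f"entropy ratio    = {reduction:.3f}")
```

In this toy setting an uninformative model (independent rows and columns) gives zero mutual information and a ratio of 1, while a model that always matches the optimal policy drives the residual entropy to zero, so the ratio should be read as a limiting quantity in that case.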