Optimal Control of the Future via Prospective Foraging

📅 2025-11-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the inefficiency of existing reinforcement learning and online learning methods in non-stationary environments, this paper proposes the "Prospective Control" framework, which extends PAC learning to non-stationary settings that involve control. The framework models non-stationarity explicitly and employs empirical risk minimization (ERM), proving that ERM asymptotically achieves the Bayes-optimal policy; it further yields a prospective control algorithm with formal theoretical guarantees. Key contributions: (1) a PAC-style theory of control oriented toward optimal future decision-making; (2) relaxation of the classical PAC assumption of a stationary environment; and (3) a demonstration, on canonical foraging tasks, that prospective agents are orders of magnitude more efficient than state-of-the-art RL algorithms, validating both the theoretical soundness and the practical efficacy of the approach.
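For intuition, here is a minimal sketch of prospective ERM in Python (our own toy construction, not the authors' released code; every name, and the full-information feedback assumption, are ours). Hypotheses map time to actions, so the learner represents non-stationarity explicitly and selects the hypothesis that would have incurred the least loss on the observed past.

```python
import numpy as np

# Hypothetical sketch of prospective ERM: each hypothesis maps *time* to
# an action, so non-stationarity is modeled explicitly. We assume the
# loss of every action is revealed at each step (a simplification).

def empirical_risk(hypothesis, history):
    """Average loss a time-indexed hypothesis would have incurred."""
    return np.mean([losses[hypothesis(t)] for t, losses in history])

def prospective_erm(hypotheses, history):
    """Return the hypothesis minimizing empirical risk on the past."""
    return min(hypotheses, key=lambda h: empirical_risk(h, history))

# Toy non-stationary environment: the rewarding arm alternates every
# 10 steps, so any time-independent policy is wrong half the time.
def losses_at(t):
    good_arm = (t // 10) % 2
    return [0.0 if a == good_arm else 1.0 for a in range(2)]

history = [(t, losses_at(t)) for t in range(100)]

# Hypothesis class: periodic time-indexed policies plus the two
# constant policies; ERM recovers the true period and phase.
hypotheses = [lambda t, p=p, s=s: ((t + s) // p) % 2
              for p in (5, 10, 20) for s in range(p)]
hypotheses += [lambda t, a=a: a for a in (0, 1)]

best = prospective_erm(hypotheses, history)
future_risk = np.mean([losses_at(t)[best(t)] for t in range(100, 200)])
print(f"future risk of selected policy: {future_risk:.2f}")  # -> 0.00
```

Because the hypothesis class contains time-indexed policies, the selected policy keeps tracking the alternating environment into the future, something no constant (stationary) policy can do.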

📝 Abstract
Optimal control of the future is the next frontier for AI. Current approaches to this problem are typically rooted in either reinforcement learning or online learning. While powerful, these frameworks for learning are mathematically distinct from Probably Approximately Correct (PAC) learning, which has been the workhorse for the recent technological achievements in AI. We therefore build on the prior work of prospective learning, an extension of PAC learning (without control) in non-stationary environments (De Silva et al., 2023; Silva et al., 2024; Bai et al., 2026). Here, we further extend the PAC learning framework to address learning and control in non-stationary environments. Using this framework, called "Prospective Control", we prove that under certain fairly general assumptions, empirical risk minimization (ERM) asymptotically achieves the Bayes optimal policy. We then consider a specific instance of prospective control, foraging, which is a canonical task for any mobile agent, be it natural or artificial. We illustrate that existing reinforcement learning algorithms fail to learn in these non-stationary environments, and even with modifications, they are orders of magnitude less efficient than our prospective foraging agents. Code is available at: https://github.com/neurodata/ProspectiveLearningwithControl.
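Schematically, in our own notation (paraphrasing the prospective-learning setup rather than quoting the paper's definitions), the quantity prospective ERM targets is the time-averaged future risk rather than the risk under one fixed distribution:

```latex
h^{*} \in \operatorname*{arg\,min}_{h \in \mathcal{H}} \;
\limsup_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T}
\mathbb{E}_{z_t \sim P_t}\!\left[ \ell\big(h(t), z_t\big) \right]
```

Here each hypothesis h maps time t to a decision rule, P_t is the (possibly drifting) distribution at time t, and prospective ERM replaces the expectation with an empirical average over the observed past; the paper's claim is that, under its assumptions, this minimizer asymptotically achieves the Bayes optimal policy.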
Problem

Research questions and friction points this paper is trying to address.

Extending PAC learning to handle control in non-stationary environments
Developing a prospective control framework for optimal future decision-making
Addressing the failure of RL algorithms on foraging tasks in dynamic settings (see the sketch below)
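To make the last point concrete, the following hedged sketch (again our toy, not the paper's benchmark) shows why a stationary learner struggles: a bandit learner with time-independent action values cannot track the alternating reward from the example above, so its long-run loss hovers near chance, while the time-indexed prospective policy achieves zero loss.

```python
import numpy as np

# Our toy illustration (not the paper's experiments): an epsilon-greedy
# bandit learner with *time-independent* action values, run on the same
# alternating environment as the prospective-ERM sketch above.
rng = np.random.default_rng(0)
q = np.zeros(2)                      # stationary action-value estimates
alpha, eps = 0.1, 0.1
losses = []
for t in range(1000):
    good_arm = (t // 10) % 2         # the rewarding arm keeps switching
    a = int(rng.integers(2)) if rng.random() < eps else int(np.argmax(q))
    r = 1.0 if a == good_arm else 0.0
    q[a] += alpha * (r - q[a])       # update ignores time entirely
    losses.append(1.0 - r)

print(f"average loss of stationary learner: {np.mean(losses):.2f}")  # ~0.5
```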
Innovation

Methods, ideas, or system contributions that make the work stand out.

Extends PAC learning to control in non-stationary environments
Uses empirical risk minimization to asymptotically achieve the Bayes-optimal policy
Introduces prospective control, yielding efficient foraging agents
Authors

Yuxin Bai
Johns Hopkins University

Aranyak Acharyya
Johns Hopkins University

Ashwin De Silva
PhD Student, Johns Hopkins University
Deep Learning, Machine Learning, Computer Vision, Statistics

Zeyu Shen
Institute of Software, Chinese Academy of Sciences
Computer Graphics, Geometry Modeling

James Hassett
Johns Hopkins University

J. Vogelstein
Johns Hopkins University