🤖 AI Summary
To address the weak generalization and poor cross-scenario transferability of multi-task reinforcement learning (RL) in dynamic environments, this paper proposes a feature-driven hybrid Actor-Critic framework that unifies model-based RL (MBRL) and model-free RL (MFRL) for integrated planning, execution, and online learning. We introduce a novel feature-model guidance mechanism that enables on-demand subnetwork customization and dynamic environment modeling, supported by a modular neural network architecture. Experiments on urban and agricultural simulation benchmarks demonstrate that our approach improves task generalization performance by over 32% compared to state-of-the-art MBRL and MFRL baselines, while achieving a 2.1× speedup in inference. The method significantly enhances policy transferability and practical applicability across heterogeneous tasks and dynamically changing environments.
📝 Abstract
Model-based reinforcement learning (MBRL) and model-free reinforcement learning (MFRL) evolved along distinct paths before converging in the design of Dyna-Q [1]. However, modern RL methods still struggle to transfer effectively across tasks and scenarios. Motivated by this limitation, we propose a generalized algorithm, Feature Model-Based Enhanced Actor-Critic (FM-EAC), that integrates planning, acting, and learning for multi-task control in dynamic environments. FM-EAC combines the strengths of MBRL and MFRL and improves generalizability through novel feature-based models and an enhanced actor-critic framework. Simulations in both urban and agricultural applications demonstrate that FM-EAC consistently outperforms many state-of-the-art MBRL and MFRL methods. Moreover, individual sub-networks within FM-EAC can be customized to user-specific requirements.
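The Dyna-Q design cited above, in which a single agent interleaves acting, direct model-free updates, and planning over a learned model, can be sketched in a few lines. The snippet below is a minimal tabular Dyna-Q illustration of that unification, not the paper's FM-EAC method; the chain environment, hyperparameters, and function names are illustrative assumptions.

```python
import random

# Minimal tabular Dyna-Q sketch (Sutton's Dyna architecture), illustrating
# the acting / direct-RL / model-learning / planning loop the abstract cites.
# Environment and hyperparameters here are illustrative assumptions only.

N_STATES, N_ACTIONS = 5, 2      # tiny deterministic chain; actions: 0=left, 1=right
GOAL = N_STATES - 1

def env_step(s, a):
    """Deterministic chain MDP: reward 1 only on reaching the goal state."""
    s2 = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
    return s2, (1.0 if s2 == GOAL else 0.0)

def dyna_q(episodes=50, planning_steps=10, alpha=0.5, gamma=0.95, eps=0.1, seed=0):
    rng = random.Random(seed)
    Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    model = {}  # (s, a) -> (s', r): learned one-step model used for planning
    for _ in range(episodes):
        s = 0
        while s != GOAL:
            # Acting: epsilon-greedy action selection
            if rng.random() < eps:
                a = rng.randrange(N_ACTIONS)
            else:
                a = max(range(N_ACTIONS), key=lambda x: Q[s][x])
            s2, r = env_step(s, a)
            # Direct RL (model-free Q-learning update)
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            # Model learning
            model[(s, a)] = (s2, r)
            # Planning: replay simulated transitions from the learned model
            for _ in range(planning_steps):
                ps, pa = rng.choice(list(model))
                ps2, pr = model[(ps, pa)]
                Q[ps][pa] += alpha * (pr + gamma * max(Q[ps2]) - Q[ps][pa])
            s = s2
    return Q

Q = dyna_q()
# Greedy policy should move right toward the goal from every non-goal state.
policy = [max(range(N_ACTIONS), key=lambda a: Q[s][a]) for s in range(GOAL)]
print(policy)
```

FM-EAC replaces the tabular value table and lookup model in this sketch with feature-based neural models and an actor-critic learner, but the same act-learn-plan loop structure applies.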