Neural ODE and SDE Models for Adaptation and Planning in Model-Based Reinforcement Learning

📅 2026-03-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of policy adaptation and planning in partially observable, stochastic environments by proposing an action-conditional latent stochastic differential equation (SDE) model. The approach combines neural ordinary differential equations (ODEs), generative adversarial networks (GANs), and inverse dynamics models to capture environmental transition dynamics in a latent space. By leveraging this structured latent representation, the method enables rapid policy adaptation with minimal interaction in new environments and supports efficient model-based planning. Evaluated on multiple stochastic continuous-control benchmarks, the proposed framework achieves competitive or superior performance, in both sample efficiency and policy quality, compared to state-of-the-art model-based and model-free reinforcement learning algorithms.

📝 Abstract
We investigate neural ordinary and stochastic differential equations (neural ODEs and SDEs) to model stochastic dynamics in fully and partially observed environments within a model-based reinforcement learning (RL) framework. Through a sequence of simulations, we show that neural SDEs more effectively capture the inherent stochasticity of transition dynamics, enabling high-performing policies with improved sample efficiency in challenging scenarios. We leverage neural ODEs and SDEs for efficient policy adaptation to changes in environment dynamics via inverse models, requiring only limited interactions with the new environment. To address partial observability, we introduce a latent SDE model that combines an ODE with a GAN-trained stochastic component in latent space. Policies derived from this model provide a strong baseline, outperforming or matching general model-based and model-free approaches across stochastic continuous-control benchmarks. This work demonstrates the applicability of action-conditional latent SDEs for RL planning in environments with stochastic transitions. Our code is available at: https://github.com/ChaoHan-UoS/NeuralRL
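The abstract's central object is an action-conditional latent SDE: a deterministic ODE drift plus a stochastic diffusion term (trained adversarially in the paper) integrated in latent space. The minimal sketch below illustrates the rollout mechanics only, using Euler-Maruyama integration with toy linear maps standing in for the learned drift and diffusion networks; all names, dimensions, and the linear parameterization are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions for the toy latent space and action space.
latent_dim, action_dim = 4, 2

# Linear stand-ins for the learned drift (ODE) and diffusion networks.
W_drift = rng.normal(scale=0.1, size=(latent_dim, latent_dim + action_dim))
W_diff = rng.normal(scale=0.1, size=(latent_dim, latent_dim + action_dim))

def drift(z, a):
    """Deterministic (ODE) part of the latent dynamics, f(z, a)."""
    return W_drift @ np.concatenate([z, a])

def diffusion(z, a):
    """Stochastic part, g(z, a): state- and action-dependent noise scale.
    In the paper this component is trained with a GAN objective; here it
    is a fixed linear map purely for illustration."""
    return np.abs(W_diff @ np.concatenate([z, a]))

def rollout(z0, actions, dt=0.05):
    """Euler-Maruyama integration of dz = f(z, a) dt + g(z, a) dW,
    conditioned on a sequence of actions."""
    z, traj = z0, [z0]
    for a in actions:
        dW = rng.normal(scale=np.sqrt(dt), size=latent_dim)
        z = z + drift(z, a) * dt + diffusion(z, a) * dW
        traj.append(z)
    return np.stack(traj)

z0 = np.zeros(latent_dim)
actions = rng.normal(size=(10, action_dim))
traj = rollout(z0, actions)
print(traj.shape)  # (11, 4): initial latent state plus 10 integration steps
```

For planning, such a model would be rolled out under many candidate action sequences and the resulting latent trajectories scored by a learned reward or value model; the stochastic diffusion term is what lets the rollouts reflect transition noise rather than a single deterministic path.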
Problem

Research questions and friction points this paper is trying to address.

stochastic dynamics
model-based reinforcement learning
partial observability
policy adaptation
planning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Neural SDE
Model-based Reinforcement Learning
Latent SDE
Policy Adaptation
Partial Observability