FlowRL: A Taxonomy and Modular Framework for Reinforcement Learning with Diffusion Policies

📅 2026-03-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
Reinforcement learning with diffusion and flow policies currently lacks a unified taxonomy and an efficient training framework, and the absence of explicit log-probability densities in these models hinders compatibility with conventional policy gradient methods. This work proposes the first systematic classification scheme tailored to such policies and introduces a modular, open-source framework built on JAX that supports just-in-time (JIT) compilation and high-throughput training. It also establishes standardized benchmarks across Gym-Locomotion, the DeepMind Control Suite, and IsaacLab, offering reproducible empirical comparisons and practical guidance for algorithm selection. This infrastructure substantially improves prototyping efficiency and research reproducibility at the emerging intersection of generative modeling and reinforcement learning.
📝 Abstract
Thanks to their remarkable flexibility, diffusion models and flow models have emerged as promising candidates for policy representation. However, efficient reinforcement learning (RL) with these policies remains challenging because vanilla policy gradient estimators require explicit log-probabilities, which these models do not provide. While numerous methods have been proposed to address this, the field lacks a unified perspective that reconciles these seemingly disparate approaches, hampering ongoing development. In this paper, we bridge this gap by introducing a comprehensive taxonomy of RL algorithms with diffusion/flow policies. To support reproducibility and agile prototyping, we introduce a modular, JAX-based open-source codebase that leverages JIT compilation for high-throughput training. Finally, we provide systematic, standardized benchmarks across Gym-Locomotion, the DeepMind Control Suite, and IsaacLab, offering a rigorous side-by-side comparison of diffusion-based methods and guidance for practitioners choosing an algorithm suited to their application. Our work establishes a clear foundation for understanding and algorithm design, a high-efficiency toolkit for future research in the field, and an algorithmic guideline for practitioners in generative modeling and robotics. Our code is available at https://github.com/typoverflow/flow-rl.
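The core friction the abstract describes — a flow policy samples actions by integrating an ODE, so there is no closed-form log π(a|s) to plug into a vanilla policy gradient — can be illustrated with a minimal sketch. This is not code from the FlowRL codebase; the velocity field below is a toy stand-in for a learned network v_θ(a, t | s), and all names (`velocity_field`, `sample_action`) are hypothetical:

```python
import math

def velocity_field(state, action, t):
    # Toy stand-in for a learned velocity field v_theta(a, t | s):
    # drift the action toward a state-dependent target.
    target = [math.tanh(s) for s in state]
    return [tg - a for tg, a in zip(target, action)]

def sample_action(state, noise, num_steps=10):
    """Sample an action by Euler-integrating the flow ODE da/dt = v(a, t | s)
    from t=0 to t=1, starting at the initial noise.

    The action is a deterministic transform of the noise through many ODE
    steps, so the policy density log pi(a | s) has no closed form -- the
    obstacle to vanilla policy-gradient estimators noted in the abstract.
    """
    a = list(noise)
    dt = 1.0 / num_steps
    for step in range(num_steps):
        t = step * dt
        v = velocity_field(state, a, t)
        a = [ai + dt * vi for ai, vi in zip(a, v)]
    return a
```

Each Euler step moves the action a small distance along the learned velocity field; the taxonomy in the paper classifies the different ways RL algorithms work around the missing density of this sampling process.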
Problem

Research questions and friction points this paper is trying to address.

reinforcement learning
diffusion policies
flow models
policy gradient
taxonomy
Innovation

Methods, ideas, or system contributions that make the work stand out.

diffusion policies
flow models
reinforcement learning
modular framework
taxonomy
Chenxiao Gao
Georgia Institute of Technology
Edward Chen
Georgia Institute of Technology
Tianyi Chen
Georgia Institute of Technology
Bo Dai
Google Brain & Georgia Tech
Machine Learning