GenFlowRL: Shaping Rewards with Generative Object-Centric Flow in Visual Reinforcement Learning

📅 2025-08-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing video-based generative robot learning approaches suffer from unstable generation quality, limited fine-grained manipulation capability, no integration of environmental feedback, and a scarcity of real-world demonstration data. To address these limitations, we propose GenFlowRL, a novel framework that introduces the first generative object-centric optical flow model. This model extracts low-dimensional, disentangled object motion representations from heterogeneous (simulated and real) visual data, enabling differentiable reward shaping. By unifying video generation, inverse dynamics modeling, and reinforcement learning, GenFlowRL mitigates the detrimental impact of video generation uncertainty on policy optimization. We evaluate GenFlowRL on ten diverse manipulation tasks spanning simulation and real-world settings. Results show significant improvements over state-of-the-art baselines, supporting its generalization capability, robustness to domain shift, and cross-platform adaptability.

📝 Abstract

Recent advances have shown that video generation models can enhance robot learning by deriving effective robot actions through inverse dynamics. However, these methods depend heavily on the quality of the generated data and struggle with fine-grained manipulation because they lack environment feedback. While video-based reinforcement learning improves policy robustness, it remains constrained by the uncertainty of video generation and the difficulty of collecting large-scale robot datasets for training diffusion models. To address these limitations, we propose GenFlowRL, which derives shaped rewards from flow generated by a model trained on diverse cross-embodiment datasets. This enables learning generalizable and robust policies from diverse demonstrations using low-dimensional, object-centric features. Experiments on 10 manipulation tasks, both in simulation and in real-world cross-embodiment evaluations, demonstrate that GenFlowRL effectively leverages manipulation features extracted from generated object-centric flow, consistently achieving superior performance across diverse and challenging scenarios. Project page: https://colinyu1.github.io/genflowrl
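
The abstract does not spell out how the shaped reward is computed, so the following is only a minimal sketch of one way a reward could be shaped from object-centric flow. It assumes the generated flow has already been reduced to target 2D keypoint positions for the object, and it swaps in standard potential-based reward shaping (a potential equal to the negative mean keypoint distance) rather than the paper's actual reward. The names `flow_potential` and `shaped_reward`, the keypoint representation, and the discount `gamma` are all hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def flow_potential(tracked: Sequence[Point], target: Sequence[Point]) -> float:
    """Potential = negative mean Euclidean distance between the object's
    tracked keypoints and the keypoints predicted by the generated flow.
    (Assumed form; not taken from the paper.)"""
    if len(tracked) != len(target) or not tracked:
        raise ValueError("keypoint sets must be non-empty and equal in length")
    total = sum(math.dist(p, q) for p, q in zip(tracked, target))
    return -total / len(tracked)


def shaped_reward(
    env_reward: float,
    prev_tracked: Sequence[Point],
    next_tracked: Sequence[Point],
    target: Sequence[Point],
    gamma: float = 0.99,
) -> float:
    """Add a potential-based shaping term gamma * phi(s') - phi(s)
    to the environment reward."""
    shaping = gamma * flow_potential(next_tracked, target) - flow_potential(
        prev_tracked, target
    )
    return env_reward + shaping


# Hypothetical example: two object keypoints moving toward the flow target.
target = [(1.0, 0.0), (1.0, 1.0)]
before = [(0.0, 0.0), (0.0, 1.0)]
after = [(0.5, 0.0), (0.5, 1.0)]
toward = shaped_reward(0.0, before, after, target, gamma=1.0)
away = shaped_reward(0.0, after, before, target, gamma=1.0)
```

In this sketch a step that moves the object's keypoints closer to the flow-predicted positions earns a positive shaping bonus, and a step that moves them away earns a penalty. The potential-based form is used here because it is a well-known way to add dense guidance without changing which policies are optimal for the underlying task reward.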

Problem

Research questions and friction points this paper is trying to address.

Improving robot learning with generative object-centric flow

Addressing uncertainty in video generation for reinforcement learning

Enhancing policy robustness with diverse cross-embodiment datasets

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses generative object-centric flow for rewards

Leverages cross-embodiment datasets for training

Extracts low-dimensional object-centric features

🔎 Similar Papers

GFlowNet Training by Policy Gradients