$\pi^{*}_{0.6}$: a VLA That Learns From Experience

📅 2025-11-18
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenge of continual self-improvement for vision-language-action (VLA) models in real-world deployment. The authors propose RECAP, an advantage-conditioned policy framework that unifies offline demonstration data, online robot interaction data, and expert teleoperation interventions: offline RL pretraining is followed by closed-loop online optimization, enabling policy refinement from multiple heterogeneous data sources. Its core innovation is modeling the policy distribution conditioned on the advantage function, which improves robustness to task dynamics and environmental perturbations. Evaluated in real household environments, RECAP executes complex, long-horizon tasks, including clothing folding, cardboard-box assembly, and operating a professional espresso machine, achieving more than a 2.1× throughput improvement and a 48% reduction in failure rate on the most challenging tasks.

📝 Abstract
We study how vision-language-action (VLA) models can improve through real-world deployments via reinforcement learning (RL). We present a general-purpose method, RL with Experience and Corrections via Advantage-conditioned Policies (RECAP), that provides for RL training of VLAs via advantage conditioning. Our method incorporates heterogeneous data into the self-improvement process, including demonstrations, data from on-policy collection, and expert teleoperated interventions provided during autonomous execution. RECAP starts by pre-training a generalist VLA with offline RL, which we call $\pi^{*}_{0.6}$, that can then be specialized to attain high performance on downstream tasks through on-robot data collection. We show that the $\pi^{*}_{0.6}$ model trained with the full RECAP method can fold laundry in real homes, reliably assemble boxes, and make espresso drinks using a professional espresso machine. On some of the hardest tasks, RECAP more than doubles task throughput and roughly halves the task failure rate.
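The abstract's central idea, advantage conditioning, can be sketched in a few lines. This is an illustrative toy, not the paper's implementation: the function name and the binarization rule are assumptions. The intuition is that each step is labeled by whether its return beat a value baseline, the policy is trained on all data conditioned on that label, and at deployment the label is clamped to "high advantage" to elicit the better behavior.

```python
# Hypothetical sketch of advantage conditioning (names and the
# binarization rule are assumptions, not from the paper).
def advantage_tokens(returns, values):
    """Binarize advantages A = R - V(s) into conditioning tokens.

    Steps whose return exceeds the value baseline get token 1
    ("high advantage"); the rest get token 0. A policy trained on
    (observation, token) -> action pairs can then be deployed with
    the token fixed to 1.
    """
    return [1 if r - v > 0 else 0 for r, v in zip(returns, values)]

# Toy rollout: per-step Monte Carlo returns and value estimates.
returns = [1.0, 0.2, 0.9, -0.5]
values = [0.5, 0.5, 0.5, 0.5]
print(advantage_tokens(returns, values))  # → [1, 0, 1, 0]
```

Because the conditioning label is part of the input rather than a loss weight, both successful and unsuccessful experience contribute supervision, which is what lets heterogeneous data (demonstrations, rollouts, interventions) share one training objective.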
Problem

Research questions and friction points this paper is trying to address.

Improving vision-language-action models through real-world reinforcement learning deployments
Incorporating heterogeneous data sources into the VLA self-improvement process
Enhancing robot task performance on complex real-world manipulation tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Vision-language-action model learns via reinforcement learning
RECAP method incorporates heterogeneous data sources
Pre-trained VLA specializes through on-robot data collection
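The three innovation points above form one loop: offline pretraining, then repeated rounds of on-robot collection and retraining. The sketch below shows that control flow under stated assumptions; `collect`, `relabel`, and `train` are hypothetical stand-ins for the paper's actual components.

```python
# Illustrative sketch of a RECAP-style self-improvement loop, as
# described on this page. Function names and structure are
# assumptions, not the authors' implementation.
def recap_loop(pretrain_data, rounds, collect, relabel, train):
    """Offline RL pretraining followed by rounds of on-robot refinement."""
    dataset = list(pretrain_data)          # demonstrations for offline RL
    policy = train(dataset)                # generalist VLA (pretraining)
    for _ in range(rounds):
        episodes = collect(policy)         # on-policy rollouts + expert interventions
        dataset.extend(relabel(episodes))  # tag steps with advantage tokens
        policy = train(dataset)            # advantage-conditioned retraining
    return policy
```

Note that the dataset only grows: earlier demonstrations are never discarded, and corrections collected during autonomous execution enter the same pool, which matches the multi-source data story in the summary above.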
👥 Authors

Ali Amin (Physical Intelligence)
Raichelle J. Aniceto (Physical Intelligence)
Ashwin Balakrishna (Physical Intelligence)
Kevin Black (Physical Intelligence)
Ken Conley (Physical Intelligence)
Grace Connors (Physical Intelligence)
James Darpinian (Physical Intelligence)
Karan Dhabalia (Physical Intelligence)
Jared DiCarlo (Physical Intelligence)
Danny Driess (Google DeepMind)
Michael Equi (UC Berkeley)
Adnan Esmail (Physical Intelligence)
Yunhao Fang (ByteDance)
Chelsea Finn (Stanford University, Physical Intelligence)
Catherine Glossop (Physical Intelligence)
Thomas Godden (Physical Intelligence)
Ivan Goryachev (Physical Intelligence)
Lachy Groom (Physical Intelligence)
Hunter Hancock (Physical Intelligence)
Karol Hausman (Physical Intelligence, Stanford)
Gashon Hussein (Physical Intelligence)
Brian Ichter (Physical Intelligence)
Szymon Jakubczak (Physical Intelligence)
Rowan Jen (Physical Intelligence)
Tim Jones (Physical Intelligence)
Ben Katz (Physical Intelligence)
Liyiming Ke (Physical Intelligence)
Chandra Kuchi (Physical Intelligence)
Marinda Lamb (Physical Intelligence)
Devin LeBlanc (Physical Intelligence)
Sergey Levine (UC Berkeley, Physical Intelligence)
Adrian Li-Bell (Physical Intelligence)
Yao Lu (Physical Intelligence)
Vishnu Mano (Physical Intelligence)
Mohith Mothukuri (Physical Intelligence)
Suraj Nair (Physical Intelligence)
Karl Pertsch (UC Berkeley, Stanford University)
Allen Z. Ren (Physical Intelligence)
Charvi Sharma (Physical Intelligence)
L. Shi (Physical Intelligence)
Laura Smith (Physical Intelligence)
Jost Tobias Springenberg (Google DeepMind)
Kyle Stachowicz (UC Berkeley)
Will Stoeckle (Physical Intelligence)
Alex Swerdlow (Physical Intelligence)
James Tanner (University of Glasgow)
Marcel Torne (Stanford University)
Quan Vuong (Physical Intelligence)
Anna Walling (Physical Intelligence)
Haohuan Wang (Physical Intelligence)
Blake Williams (Physical Intelligence)
Sukwon Yoo (Physical Intelligence)
Lili Yu (Meta AI)
Ury Zhilinsky (Physical Intelligence)
Zhiyuan Zhou (UC Berkeley)