Out-of-Distribution Generalization with a SPARC: Racing 100 Unseen Vehicles with a Single Policy

📅 2025-11-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses generalization to out-of-distribution (OOD) environments in robot control. We propose SPARC, a single-phase adaptive framework that operates without explicit context information at test time. Unlike conventional two-phase paradigms, where the context encoder and the history adaptation module are trained in separate stages, SPARC jointly learns implicit context inference and end-to-end policy optimization for online adaptation. We evaluate SPARC on the high-fidelity Gran Turismo 7 racing simulator and on wind-perturbed MuJoCo environments. Using a single unified policy, SPARC successfully races 100 unseen vehicles, demonstrating robust zero-shot OOD generalization. Our approach advances contextual reinforcement learning with a more concise, computationally efficient, and scalable paradigm that removes the reliance on explicit context inputs and multi-stage adaptation pipelines.

📝 Abstract
Generalization to unseen environments is a significant challenge in the field of robotics and control. In this work, we focus on contextual reinforcement learning, where agents act within environments with varying contexts, such as self-driving cars or quadrupedal robots that need to operate in different terrains or weather conditions than they were trained for. We tackle the critical task of generalizing to out-of-distribution (OOD) settings, without access to explicit context information at test time. Recent work has addressed this problem by training a context encoder and a history adaptation module in separate stages. While promising, this two-phase approach is cumbersome to implement and train. We simplify the methodology and introduce SPARC: single-phase adaptation for robust control. We test SPARC on varying contexts within the high-fidelity racing simulator Gran Turismo 7 and wind-perturbed MuJoCo environments, and find that it achieves reliable and robust OOD generalization.
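The setting described above, acting under a hidden context without being told its value at test time, can be illustrated with a toy sketch. This is not SPARC and not the authors' code: the 1-D dynamics, the residual-averaging estimator, and the names `step`, `infer_wind`, `policy`, and `rollout` are all assumptions made purely for illustration. It only shows how a context (here a constant "wind") might be inferred implicitly from the agent's own transition history.

```python
import random

def step(x, action, wind):
    """Toy 1-D dynamics (hypothetical): the hidden context `wind` shifts every transition."""
    return x + action + wind

def infer_wind(history):
    """Estimate the hidden context from (x, action, next_x) tuples alone."""
    if not history:
        return 0.0  # no evidence yet, so assume no disturbance
    residuals = [next_x - x - a for x, a, next_x in history]
    return sum(residuals) / len(residuals)

def policy(x, wind_estimate, target=0.0):
    """Move toward the target while cancelling the estimated disturbance."""
    return (target - x) - wind_estimate

def rollout(wind, steps=20, x0=5.0, seed=0):
    """Run one episode; the true `wind` is used only by the simulator, never by the policy."""
    rng = random.Random(seed)
    x, history = x0, []
    for _ in range(steps):
        a = policy(x, infer_wind(history))
        next_x = step(x, a, wind) + rng.gauss(0.0, 0.01)  # small observation noise
        history.append((x, a, next_x))
        x = next_x
    return x, infer_wind(history)
```

Calling `rollout(wind=3.0)` returns the final position and the wind estimate recovered from history. In this toy the estimator is hand-written; the approaches discussed in the abstract learn the adaptation from data instead.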
Problem

Research questions and friction points this paper is trying to address.

Achieving generalization to unseen environments in robotics and control
Handling out-of-distribution settings without explicit context information
Simplifying cumbersome two-phase approaches for robust control adaptation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Single-phase adaptation for robust control
Simplifies two-stage context encoder training
Achieves reliable out-of-distribution generalization
Bram Grooten
PhD candidate, Eindhoven University of Technology
deep learning, reinforcement learning
Patrick MacAlpine
Sony AI
Kaushik Subramanian
Sony AI
Peter Stone
Sony AI, The University of Texas at Austin
Peter R. Wurman
Sony AI