Vision-Conditioned Variational Bayesian Last Layer Dynamics Models

📅 2026-01-14
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the problem of robotic systems losing control in dynamic environments because they cannot anticipate how environmental changes affect system dynamics. It proposes a vision-driven, context-aware dynamics modeling approach that uses visual information as a conditioning signal for a variational Bayesian last-layer dynamics model. The nominal model is fine-tuned with feature-wise affine transformations of its latent features, conditioned on vision, which enables proactive adaptation to abrupt environmental shifts. Notably, this is the first framework to incorporate visual conditioning into Bayesian dynamics learning and embed it within an optimal controller. In real-world experiments with a Lexus LC500 racing through water puddles, the proposed system completed all 12 attempted laps, whereas all baselines without visual context lost control, demonstrating the importance of proactive dynamics adaptation.

📝 Abstract
Agile control of robotic systems often requires anticipating how the environment affects system behavior. For example, a driver must perceive the road ahead to anticipate available friction and plan actions accordingly. Achieving such proactive adaptation within autonomous frameworks remains a challenge, particularly under rapidly changing conditions. Traditional modeling approaches often struggle to capture abrupt variations in system behavior, while adaptive methods are inherently reactive and may adapt too late to ensure safety. We propose a vision-conditioned variational Bayesian last-layer dynamics model that leverages visual context to anticipate changes in the environment. The model first learns nominal vehicle dynamics and is then fine-tuned with feature-wise affine transformations of latent features, enabling context-aware dynamics prediction. The resulting model is integrated into an optimal controller for vehicle racing. We validate our method on a Lexus LC500 racing through water puddles. With vision-conditioning, the system completed all 12 attempted laps under varying conditions. In contrast, all baselines without visual context consistently lost control, demonstrating the importance of proactive dynamics adaptation in high-performance applications.
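The prediction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the one-layer feature map, the linear map from a visual embedding `c` to the affine parameters, the scalar output, and all numeric values are assumptions chosen for illustration. It shows only how a feature-wise affine transformation of latent features can feed a Bayesian last layer with a Gaussian weight posterior, whose predictive mean is `w_mean · phi` and whose predictive variance is `phi^T w_cov phi + noise_var`. Variational training of the posterior is not shown.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def features(x, W, b):
    """Nominal feature map phi(x) = tanh(W x + b); a stand-in for a trained network."""
    return [math.tanh(dot(row, x) + b_i) for row, b_i in zip(W, b)]


def film_params(c, G, B):
    """Hypothetical linear map from a visual embedding c to (gamma, beta).

    gamma = 1 + G c and beta = B c, so c = 0 leaves the nominal features unchanged.
    """
    gamma = [1.0 + dot(row, c) for row in G]
    beta = [dot(row, c) for row in B]
    return gamma, beta


def film(phi, gamma, beta):
    """Feature-wise affine transformation: gamma * phi + beta, elementwise."""
    return [g * p + s for g, p, s in zip(gamma, phi, beta)]


def bll_predict(phi, w_mean, w_cov, noise_var):
    """Gaussian predictive of a Bayesian linear last layer with weights ~ N(w_mean, w_cov)."""
    mean = dot(w_mean, phi)
    var = dot(phi, [dot(row, phi) for row in w_cov]) + noise_var
    return mean, var


# Illustrative (made-up) parameters: 2 inputs, 2 latent features, 1-D visual context.
W = [[0.5, -0.2], [0.1, 0.3]]
b = [0.0, 0.1]
G = [[-0.6], [0.0]]   # context scales the first feature
B = [[0.0], [0.2]]    # context shifts the second feature
W_MEAN = [1.0, -0.5]
W_COV = [[0.04, 0.0], [0.0, 0.09]]
NOISE_VAR = 0.01


def predict(x, c):
    """Vision-conditioned prediction: modulate features by context, then apply the last layer."""
    gamma, beta = film_params(c, G, B)
    return bll_predict(film(features(x, W, b), gamma, beta), W_MEAN, W_COV, NOISE_VAR)


x = [1.0, 2.0]                         # e.g. a state/input pair
mean_dry, var_dry = predict(x, [0.0])  # zero context: nominal dynamics
mean_wet, var_wet = predict(x, [1.0])  # nonzero context: modulated dynamics
print(mean_dry, var_dry)
print(mean_wet, var_wet)
```

In this sketch a zero context vector reproduces the nominal prediction, which mirrors the two-stage idea of learning nominal dynamics first and adding context-dependent modulation afterwards. A real system would replace the linear `film_params` map with a learned vision encoder.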
Problem

Research questions and friction points this paper is trying to address.

vision-conditioned
dynamics adaptation
proactive control
autonomous systems
environmental uncertainty
Innovation

Methods, ideas, or system contributions that make the work stand out.

vision-conditioned dynamics
variational Bayesian inference
last-layer adaptation
proactive control
context-aware modeling
Paul Brunzema
Department of Mechanical Engineering, Aachen University, Germany
Thomas Lew
Toyota Research Institute
Robotics, Optimal Control, Machine Learning
Ray Zhang
Toyota Research Institute, 4440 El Camino Real, CA, USA
Takeru Shirasawa
Toyota Research Institute, 4440 El Camino Real, CA, USA
John K. Subosits
Toyota Research Institute, 4440 El Camino Real, CA, USA
Marcus Greiff
Senior Research Scientist, Toyota Research Institute
Control Theory, Estimation Theory, Nonlinear Control