Vision-Conditioned Variational Bayesian Last Layer Dynamics Models

📅 2026-01-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Robotic systems can lose control in rapidly changing environments when they cannot anticipate how those changes affect their dynamics. This work proposes a context-aware dynamics model that uses visual information as a conditioning signal for a variational Bayesian last-layer dynamics model. The model first learns nominal vehicle dynamics and is then fine-tuned with feature-wise affine transformations of its latent features, conditioned on visual context, so that it can adapt proactively to abrupt environmental shifts. It is described as the first framework to incorporate visual conditioning into Bayesian dynamics learning and to embed the resulting model in an optimal controller. In real-world experiments with a Lexus LC500 racing through water puddles, the vision-conditioned system completed all 12 attempted laps, whereas every baseline without visual context lost control.
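To make the feature-wise affine conditioning concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names, the linear maps `w_gamma` and `w_beta`, the two-dimensional "visual context" vector, and all numbers are hypothetical. The scale is written as `1 + W_gamma c` so that a zero context leaves the nominal features unchanged, which is one common design choice and not something the summary specifies.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def film(features: Vector, context: Vector,
         w_gamma: Matrix, w_beta: Matrix) -> Vector:
    """Feature-wise affine modulation: h' = gamma(c) * h + beta(c).

    Assumption: gamma and beta are linear functions of the context, and
    gamma is offset by 1 so that a zero context is the identity map.
    """
    gamma = [1.0 + g for g in matvec(w_gamma, context)]
    beta = matvec(w_beta, context)
    return [g * h + b for g, h, b in zip(gamma, features, beta)]


# Hypothetical latent features from a nominal dynamics network.
nominal_features = [0.5, -1.0, 2.0]

# Hypothetical visual context embeddings (e.g. "dry road" vs "puddle ahead").
dry_context = [0.0, 0.0]
wet_context = [1.0, 0.0]

# Hypothetical fine-tuned modulation weights (3 features x 2 context dims).
w_gamma = [[-0.5, 0.0], [0.0, 0.0], [0.25, 0.0]]
w_beta = [[0.1, 0.0], [0.0, 0.0], [-0.2, 0.0]]

dry_features = film(nominal_features, dry_context, w_gamma, w_beta)
wet_features = film(nominal_features, wet_context, w_gamma, w_beta)
print(dry_features)
print(wet_features)
```

In this sketch the nominal feature extractor is untouched; only the modulation weights would be learned during fine-tuning, and the modulated features would then feed the Bayesian last layer.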

📝 Abstract

Agile control of robotic systems often requires anticipating how the environment affects system behavior. For example, a driver must perceive the road ahead to anticipate available friction and plan actions accordingly. Achieving such proactive adaptation within autonomous frameworks remains a challenge, particularly under rapidly changing conditions. Traditional modeling approaches often struggle to capture abrupt variations in system behavior, while adaptive methods are inherently reactive and may adapt too late to ensure safety. We propose a vision-conditioned variational Bayesian last-layer dynamics model that leverages visual context to anticipate changes in the environment. The model first learns nominal vehicle dynamics and is then fine-tuned with feature-wise affine transformations of latent features, enabling context-aware dynamics prediction. The resulting model is integrated into an optimal controller for vehicle racing. We validate our method on a Lexus LC500 racing through water puddles. With vision-conditioning, the system completed all 12 attempted laps under varying conditions. In contrast, all baselines without visual context consistently lost control, demonstrating the importance of proactive dynamics adaptation in high-performance applications.
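The role of the Bayesian last layer can be illustrated with a short sketch. It assumes a scalar output, a Gaussian posterior over the last-layer weights with mean `mu` and covariance `sigma`, and additive Gaussian noise with variance `noise_var`; the abstract does not give these details, so the names, shapes, and numbers below are hypothetical. Under those assumptions the prediction for a feature vector phi has mean mu^T phi and variance phi^T Sigma phi + noise_var.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[Vector]


def last_layer_predictive(phi: Vector, mu: Vector, sigma: Matrix,
                          noise_var: float) -> Tuple[float, float]:
    """Predictive mean and variance of y = w^T phi + eps.

    Assumptions: w ~ N(mu, sigma) is the (variational) posterior over the
    last-layer weights and eps ~ N(0, noise_var) is observation noise.
    """
    mean = sum(m * p for m, p in zip(mu, phi))
    sigma_phi = [sum(s * p for s, p in zip(row, phi)) for row in sigma]
    variance = sum(p * sp for p, sp in zip(phi, sigma_phi)) + noise_var
    return mean, variance


# Hypothetical (possibly vision-modulated) features for one state-action pair.
phi = [1.0, 2.0]

# Hypothetical posterior over last-layer weights.
mu = [0.5, -0.25]
sigma = [[0.04, 0.0],
         [0.0, 0.01]]

pred_mean, pred_var = last_layer_predictive(phi, mu, sigma, noise_var=0.01)
print(pred_mean, pred_var)
```

Because only the last layer is treated probabilistically, the predictive variance is available in closed form, which is convenient when the model is queried repeatedly inside an optimal controller.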

Problem

Research questions and friction points this paper is trying to address.

vision-conditioned

dynamics adaptation

proactive control

autonomous systems

environmental uncertainty

Innovation

Methods, ideas, or system contributions that make the work stand out.

vision-conditioned dynamics

variational Bayesian inference

last-layer adaptation