VISION-SLS: Safe Perception-Based Control from Learned Visual Representations via System Level Synthesis

📅 2026-04-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses control from high-dimensional visual feedback under partial observability, sensor noise, and nonlinear dynamics. It proposes a safe output-feedback method that combines pretrained visual representations with System Level Synthesis (SLS). The approach learns a low-dimensional observation map with state-dependent error bounds and solves the resulting nonconvex SLS problem with a solver that couples sequential convex programming with Riccati recursions. In simulation, the method is evaluated on a 4D car and a 10D quadrotor with image observations and on a 59D humanoid task with partial observability; on hardware, it controls a ground vehicle from onboard images. The reported results show constraint satisfaction under empirically calibrated error bounds, information-gathering behavior that reduces uncertainty, and better safety rates and solve times than the baselines.
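For readers unfamiliar with the Riccati recursion mentioned in the summary, the sketch below shows the textbook backward pass for a finite-horizon, discrete-time LQR problem. It is only an illustration of the building block, not the paper's solver: the double-integrator model, the cost weights, and the function name `lqr_backward_pass` are all assumptions made for this example.

```python
import numpy as np

def lqr_backward_pass(A, B, Q, R, Q_final, horizon):
    """Backward Riccati recursion for x_{t+1} = A x_t + B u_t.

    Returns time-ordered feedback gains K_t (with u_t = -K_t x_t)
    and the cost-to-go matrix at time 0.
    """
    P = Q_final
    gains = []
    for _ in range(horizon):
        # K = (R + B' P B)^{-1} B' P A
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # P = Q + A' P (A - B K)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    gains.reverse()  # gains were computed from the final step backwards
    return gains, P

# Hypothetical double integrator (position, velocity), not from the paper.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
Q = np.eye(2)
R = np.array([[0.1]])

gains, P0 = lqr_backward_pass(A, B, Q, R, Q, horizon=50)

# Roll the closed loop forward from an initial offset.
x0 = np.array([1.0, 0.0])
x = x0.copy()
for K in gains:
    x = A @ x - B @ (K @ x)
```

Each step of the recursion costs a small dense solve, so the total work grows linearly with the horizon, which is the usual reason Riccati-style passes are attractive inside an iterative scheme such as sequential convex programming.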

📝 Abstract

We propose VISION-SLS, a method for nonlinear output-feedback control from high-resolution RGB images which provides robust constraint satisfaction guarantees under calibrated uncertainty bounds despite partial observability, sensor noise, and nonlinear dynamics. To enable scalability while retaining guarantees, we propose: (i) a learned low-dimensional observation map from pretrained visual features with state-dependent error bounds, and (ii) a causal affine time-varying output-feedback policy optimized via System Level Synthesis (SLS). We develop a scalable, novel solver for the resulting nonconvex program that leverages sequential convex programming coupled with efficient Riccati recursions. On two simulated visuomotor tasks (a 4D car and a 10D quadrotor) with >= 512 x 512 pixels and a 59D humanoid task with partial observability, our method enables safe, information-gathering behavior that reduces uncertainty while guaranteeing constraint satisfaction with empirically-calibrated error bounds. We also validate our method on hardware, safely controlling a ground vehicle from onboard images, outperforming baselines in safety rate and solve times. Together, these results show that learned visual abstractions coupled with an efficient solver make SLS-based safe visuomotor output-feedback practical at scale. The code implementation of our method is available at https://github.com/trustworthyrobotics/VISION-SLS.
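To make "causal affine time-varying output-feedback policy" concrete, here is a minimal sketch of one generic way to write such a policy: stack the observations into a vector `y`, the inputs into `u`, and require `u = K y + k` with `K` block lower-triangular so that each input depends only on observations available up to that time. This is an illustrative parameterization only. The abstract does not give the exact form optimized in VISION-SLS, and the dimensions, random values, and helper names below are assumptions.

```python
import numpy as np

def causal_mask(horizon, n_u, n_y):
    """Block lower-triangular 0/1 mask: u_t may depend on y_0..y_t only."""
    blocks = np.tril(np.ones((horizon, horizon)))
    return np.kron(blocks, np.ones((n_u, n_y)))

def apply_policy(K, k, y):
    """Stacked affine policy u = K y + k."""
    return K @ y + k

# Hypothetical sizes and random parameters, for illustration only.
rng = np.random.default_rng(0)
T, n_u, n_y = 4, 2, 3
mask = causal_mask(T, n_u, n_y)
K = rng.standard_normal((T * n_u, T * n_y)) * mask
k = rng.standard_normal(T * n_u)

y = rng.standard_normal(T * n_y)
u = apply_policy(K, k, y)

# Perturb only the observations at times 2 and 3.
y_perturbed = y.copy()
y_perturbed[2 * n_y:] += 1.0
u_perturbed = apply_policy(K, k, y_perturbed)
```

Because of the mask, the inputs at times 0 and 1 are identical in `u` and `u_perturbed`; only the later inputs react to the changed observations. That is the causality requirement expressed as a sparsity pattern on the gain matrix.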

Problem

Research questions and friction points this paper is trying to address.

safe perception-based control

output-feedback

partial observability

sensor noise

nonlinear dynamics

Innovation

Methods, ideas, or system contributions that make the work stand out.

System Level Synthesis

learned visual representations

output-feedback control