Drift is a Sampling Error: SNR-Aware Power Distributions for Long-Horizon Robotic Planning

📅 2026-05-10

🤖 AI Summary
This work addresses instruction drift in vision-language-action models on long-horizon robotic tasks, which the authors attribute to locally greedy sampling. To mitigate it, they propose Context-Aware Power Sampling (CAPS), a training-free, inference-time adaptive computation framework. CAPS sharpens global trajectory likelihood with a power distribution and adds a signal-to-noise ratio (SNR)-based metacognitive mechanism that triggers MCMC search only when drift risk is detected, switching from fast intuitive decisions to slow deliberative search. By modeling instruction drift as a systematic sampling error and combining power-distribution reweighting with SNR-aware monitoring, CAPS targets robustness on extended tasks without parameter updates. Experiments on the RoboTwin, Simpler-WindowX, and Libero-long benchmarks report substantial gains over strong baselines, including OpenVLA and TACO.
📝 Abstract
Despite rapid progress in Vision-Language-Action (VLA) models for robotic control, instruction drift remains a persistent failure mode in long-horizon tasks. This paper reconceptualizes this phenomenon, positing that instruction drift is fundamentally a systematic sampling error: local greedy sampling is prone to collapsing into "Negative Pivotal Windows," irreversible local optima with high local probability that sever global success pathways. To address this, we propose Context-Aware Power Sampling (CAPS), a training-free inference-time computation framework. CAPS leverages power distributions to sharpen global trajectory probabilities, enabling lookahead search over the model's conditional generative trajectory distribution. Furthermore, we introduce a metacognitive control mechanism based on Signal-to-Noise Ratio (SNR). This mechanism triggers adaptive MCMC search only when drift risk is detected, enabling a dynamic transition from "intuitive fast thinking" to "rational slow search." Experiments on the RoboTwin, Simpler-WindowX, and Libero-long benchmarks show that CAPS achieves substantial improvements over strong baselines, including OpenVLA and TACO, without parameter updates. These results support the effectiveness of adaptive inference-time computation for improving long-horizon robustness in embodied control.
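The idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the two-action policy, its probabilities, and every function name below are hypothetical, and the `snr` score is only a stand-in, since the abstract does not define how the SNR is computed. The sketch assumes a power distribution of the form p(trajectory)^alpha with alpha > 1, sampled with an independence Metropolis-Hastings chain that proposes whole trajectories from the base policy, so the acceptance ratio reduces to (p(proposal) / p(current))^(alpha - 1).

```python
import math
import random

# Toy two-action "policy"; all probabilities are invented for illustration.
P0 = [0.6, 0.4]                              # first-step action distribution
TRANS = {0: [0.5, 0.5], 1: [0.95, 0.05]}     # next-step distribution given last action

def step_probs(prefix):
    """Next-action distribution given the actions chosen so far."""
    return P0 if not prefix else TRANS[prefix[-1]]

def traj_logp(traj):
    """Log-probability of a whole trajectory under the toy policy."""
    return sum(math.log(step_probs(traj[:i])[a]) for i, a in enumerate(traj))

def sample_traj(horizon, rng):
    """Ancestral sampling: draw each action from its conditional distribution."""
    traj = []
    for _ in range(horizon):
        traj.append(0 if rng.random() < step_probs(traj)[0] else 1)
    return tuple(traj)

def greedy_traj(horizon):
    """Locally greedy decoding: always take the most probable next action."""
    traj = []
    for _ in range(horizon):
        p = step_probs(traj)
        traj.append(max(range(len(p)), key=p.__getitem__))
    return tuple(traj)

def power_mh(horizon, alpha, n_steps, rng):
    """Independence Metropolis-Hastings targeting p(traj)**alpha, proposing from p."""
    cur = sample_traj(horizon, rng)
    cur_lp = traj_logp(cur)
    for _ in range(n_steps):
        prop = sample_traj(horizon, rng)
        prop_lp = traj_logp(prop)
        # Target ratio times proposal ratio simplifies to (p(prop)/p(cur))**(alpha-1).
        accept = math.exp(min(0.0, (alpha - 1.0) * (prop_lp - cur_lp)))
        if rng.random() < accept:
            cur, cur_lp = prop, prop_lp
    return cur

def snr(probs, eps=1e-9):
    """Stand-in confidence score (an assumption): top-1/top-2 margin over leftover mass."""
    top = sorted(probs, reverse=True)
    return (top[0] - top[1]) / (1.0 - top[0] + eps)

def plan(horizon, alpha, threshold, n_steps, rng):
    """Decode greedily when the score is high; otherwise fall back to MCMC search."""
    if snr(step_probs(())) >= threshold:
        return greedy_traj(horizon)
    return power_mh(horizon, alpha, n_steps, rng)
```

In this toy, greedy decoding over two steps returns `(0, 0)`, which has trajectory probability 0.6 × 0.5 = 0.30, while the globally most likely trajectory is `(1, 0)` with probability 0.4 × 0.95 = 0.38. Sampling from the sharpened distribution concentrates on `(1, 0)` as `alpha` grows, which mirrors, in miniature, the claim that a locally probable first choice can cut off the better global path.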
Problem

Research questions and friction points this paper is trying to address.

instruction drift
sampling error
long-horizon robotic planning
local optima
embodied control
Innovation

Methods, ideas, or system contributions that make the work stand out.

instruction drift
power distributions
SNR-aware control
adaptive inference
long-horizon planning