Don't Run with Scissors: Pruning Breaks VLA Models but They Can Be Recovered

📅 2025-10-09
🤖 AI Summary
Pruning vision-language-action (VLA) models for deployment on resource-constrained devices often causes severe performance degradation and a sharp increase in safety violations, and existing mitigation strategies rely heavily on costly retraining. To address this, we propose GLUESTICK, a training-free post-pruning recovery method that restores much of the functionality lost during pruning. GLUESTICK performs a one-time interpolation between the dense and pruned model weights in parameter space to produce layer-wise correction terms. Each pruned layer applies its correction during inference to compensate for lost capability, with minimal overhead and without altering the sparse weights. The approach is agnostic to the pruning algorithm and exposes a single hyperparameter that controls the tradeoff between efficiency and accuracy. Experiments across diverse VLA architectures and robotic manipulation and navigation tasks show that GLUESTICK substantially recovers task success rates and reduces safety violations while keeping memory efficiency competitive.

📝 Abstract
Vision-Language-Action (VLA) models have advanced robotic capabilities but remain challenging to deploy on resource-limited hardware. Pruning has enabled efficient compression of large language models (LLMs), yet it is largely understudied in robotics. Surprisingly, we observe that pruning VLA models leads to drastic degradation and increased safety violations. We introduce GLUESTICK, a post-pruning recovery method that restores much of the original model's functionality while retaining sparsity benefits. Our method performs a one-time interpolation between the dense and pruned models in weight-space to compute a corrective term. This correction is used during inference by each pruned layer to recover lost capabilities with minimal overhead. GLUESTICK requires no additional training, is agnostic to the pruning algorithm, and introduces a single hyperparameter that controls the tradeoff between efficiency and accuracy. Across diverse VLA architectures and tasks in manipulation and navigation, GLUESTICK achieves competitive memory efficiency while substantially recovering success rates and reducing safety violations. Additional material can be found at: https://gluestick-vla.github.io/.
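The abstract does not spell out the equations, so the following is only one plausible reading of the weight-space correction, written under the assumption that the corrective term is a rank-r factorization of the gap between dense and pruned weights. The symbols W_gap, A_r, B_r, and r are illustrative names introduced here, not notation taken from the source.

```latex
\begin{aligned}
W_{\text{gap}} &= W_{\text{dense}} - W_{\text{pruned}}, \\
W_{\text{gap}} &\approx A_r B_r^{\top},
  \qquad A_r \in \mathbb{R}^{d_{\text{out}} \times r},\;
         B_r \in \mathbb{R}^{d_{\text{in}} \times r}, \\
h &= W_{\text{pruned}}\, x + A_r \left( B_r^{\top} x \right).
\end{aligned}
```

Under this assumption, r = 0 leaves the pruned layer unchanged, and a truncated SVD with r = min(d_out, d_in) reproduces the dense layer, so r moves the layer between the two endpoints. The correction stores r(d_out + d_in) extra values per layer, which is how a single hyperparameter could trade memory for accuracy.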
Problem

Research questions and friction points this paper is trying to address.

Pruning causes severe performance degradation in VLA models
VLA models suffer increased safety violations after pruning
How to recover pruned VLA functionality while maintaining sparsity
Innovation

Methods, ideas, or system contributions that make the work stand out.

GLUESTICK recovers pruned VLA models via weight interpolation
Method requires no training and is pruning-algorithm agnostic
Uses corrective term during inference to restore functionality
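The points above can be illustrated with a minimal NumPy sketch for a single linear layer. It rests on assumptions the source does not confirm: the corrective term is taken to be a truncated SVD of the dense-minus-pruned gap, magnitude pruning stands in for whichever pruning method is actually used, and every function name and the `rank` parameter are hypothetical.

```python
import numpy as np

def magnitude_prune(w, sparsity):
    """Zero the smallest-magnitude entries (stand-in for any pruning method)."""
    k = int(round(sparsity * w.size))
    if k == 0:
        return w.copy()
    # k-th smallest absolute value; entries at or below it are removed.
    threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.where(np.abs(w) <= threshold, 0.0, w)

def low_rank_correction(w_dense, w_pruned, rank):
    """One-time, offline step: keep `rank` SVD components of the weight gap.

    Assumption: this is one way to build a corrective term, not necessarily
    the construction used by GLUESTICK.
    """
    u, s, vt = np.linalg.svd(w_dense - w_pruned, full_matrices=False)
    a = u[:, :rank] * s[:rank]  # shape (d_out, rank)
    b = vt[:rank, :]            # shape (rank, d_in)
    return a, b

def corrected_forward(w_pruned, a, b, x):
    """Inference: pruned-layer output plus the low-rank correction."""
    return w_pruned @ x + a @ (b @ x)

def weight_error(w_dense, w_pruned, rank):
    """Frobenius distance between the dense and the corrected weights."""
    a, b = low_rank_correction(w_dense, w_pruned, rank)
    return float(np.linalg.norm(w_dense - (w_pruned + a @ b)))

rng = np.random.default_rng(0)
d_out, d_in = 64, 48
w_dense = rng.standard_normal((d_out, d_in))
w_pruned = magnitude_prune(w_dense, sparsity=0.5)
x = rng.standard_normal(d_in)

# rank 0 is the plain pruned layer; full rank reproduces the dense layer.
errors = {r: weight_error(w_dense, w_pruned, r) for r in (0, 8, 48)}

# Extra values stored for the correction at rank 8.
extra_params = 8 * (d_out + d_in)
```

In this sketch the pruned weights are never modified. The correction is computed once and stored as two thin matrices, so its memory cost grows linearly with `rank` while the weight-space error shrinks as `rank` increases.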