Device-Conditioned Neural Architecture Search for Efficient Robotic Manipulation

📅 2026-04-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the poor generalizability, redundant retraining requirements, and inefficiency of vision-based motor policies when deployed across heterogeneous robotic hardware. To this end, we propose DC-QFA, a unified framework featuring the first device-conditioned Once-for-All supernetwork architecture. Our approach integrates device-aware quantization-aware training, latency- and memory-aware regularization, lookup-table-based hardware-constrained neural architecture search, and multi-step on-policy distillation, enabling a single training run to produce lightweight policies adaptable to diverse platforms. Experiments demonstrate that DC-QFA achieves 2–3× speedup on edge devices, consumer-grade GPUs, and cloud platforms with negligible degradation in task success rates. Furthermore, real-robot evaluations confirm the long-term stability of low-bit policies in contact-intensive manipulation tasks.

📝 Abstract

The growing complexity of visuomotor policies makes deployment under heterogeneous robotic hardware constraints increasingly difficult. Most existing model-efficiency approaches for robotic manipulation are device- and model-specific, generalize poorly, and require time-consuming per-device optimization during adaptation. In this work, we propose a unified framework named Device-Conditioned Quantization-For-All (DC-QFA), which amortizes deployment effort through device-conditioned quantization-aware training and hardware-constrained architecture search. Specifically, we introduce a single supernet that spans a rich design space over network architectures and mixed-precision bit-widths. It is optimized with latency- and memory-aware regularization, guided by per-device lookup tables. With this supernet, for each target platform, we can perform a once-for-all lightweight search to select an optimal subnet without any per-device re-optimization, which enables more generalizable deployment across heterogeneous hardware and substantially reduces deployment time. To improve long-horizon stability under low precision, we further introduce multi-step on-policy distillation to mitigate error accumulation during closed-loop execution. Extensive experiments on three representative policy backbones, namely DiffusionPolicy-T, MDT-V, and OpenVLA-OFT, demonstrate that DC-QFA achieves 2–3× acceleration on edge devices, consumer-grade GPUs, and cloud platforms, with a negligible drop in task success. Real-world evaluations on an Inovo robot equipped with a force/torque sensor further validate that our low-bit DC-QFA policies maintain stable, contact-rich manipulation even under severe quantization.
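
To make the lookup-table-guided subnet search described in the abstract more concrete, here is a minimal Python sketch. It is not the authors' implementation: the table values, the function names (`estimate_latency_ms`, `estimate_memory_kb`, `proxy_score`, `search_subnet`), the toy memory model, and the exhaustive enumeration are all assumptions chosen for illustration, and the abstract does not say which search algorithm or accuracy estimate DC-QFA actually uses.

```python
from itertools import product

# Hypothetical per-device lookup tables (illustrative numbers only):
# (layer width, weight bit-width) -> measured latency of one layer in ms.
LATENCY_LUT_MS = {
    "edge": {(64, 4): 0.9, (64, 8): 1.4, (128, 4): 1.8, (128, 8): 3.0},
    "gpu": {(64, 4): 0.2, (64, 8): 0.25, (128, 4): 0.35, (128, 8): 0.5},
}


def estimate_latency_ms(device, subnet):
    """Sum per-layer latencies from the device's lookup table."""
    lut = LATENCY_LUT_MS[device]
    return sum(lut[layer] for layer in subnet)


def estimate_memory_kb(subnet):
    """Toy memory model (assumption): width*width weights stored at `bits` bits."""
    return sum(width * width * bits / 8 / 1024 for width, bits in subnet)


def proxy_score(subnet):
    """Stand-in for task-success estimation (assumption): favors wider,
    higher-precision layers. A real system would evaluate the subnet."""
    return sum(width / 128 + bits / 8 for width, bits in subnet)


def search_subnet(device, num_layers, latency_budget_ms, memory_budget_kb):
    """Enumerate all per-layer choices and return the best-scoring subnet
    that fits both budgets, or None if no configuration is feasible."""
    choices = list(LATENCY_LUT_MS[device])
    best, best_score = None, float("-inf")
    for subnet in product(choices, repeat=num_layers):
        if estimate_latency_ms(device, subnet) > latency_budget_ms:
            continue
        if estimate_memory_kb(subnet) > memory_budget_kb:
            continue
        score = proxy_score(subnet)
        if score > best_score:
            best, best_score = subnet, score
    return best


if __name__ == "__main__":
    for device in ("edge", "gpu"):
        subnet = search_subnet(device, num_layers=3,
                               latency_budget_ms=6.0, memory_budget_kb=100.0)
        print(device, subnet, estimate_latency_ms(device, subnet))
```

The point of the sketch is that the same search routine is reused for every device and only the lookup table changes, so no model is retrained per platform. Exhaustive enumeration is only workable for a toy space like this one; a realistic design space would need a sampling-based or evolutionary search instead.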

Problem

Research questions and friction points this paper is trying to address.

neural architecture search

device heterogeneity

model efficiency

robotic manipulation

quantization

Innovation

Methods, ideas, or system contributions that make the work stand out.

device-conditioned

quantization-aware training

neural architecture search