Device-Conditioned Neural Architecture Search for Efficient Robotic Manipulation

📅 2026-04-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the poor generalizability, redundant retraining requirements, and inefficiency of vision-based motor policies when deployed across heterogeneous robotic hardware. To this end, we propose DC-QFA, a unified framework featuring the first device-conditioned Once-for-All supernetwork architecture. Our approach integrates device-aware quantization-aware training, latency- and memory-aware regularization, lookup-table-based hardware-constrained neural architecture search, and multi-step on-policy distillation, enabling a single training run to produce lightweight policies adaptable to diverse platforms. Experiments demonstrate that DC-QFA achieves 2–3× speedup on edge devices, consumer-grade GPUs, and cloud platforms with negligible degradation in task success rates. Furthermore, real-robot evaluations confirm the long-term stability of low-bit policies in contact-intensive manipulation tasks.
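The lookup-table-based latency modeling mentioned in the summary can be sketched as follows. This is a hypothetical illustration: the layer types, widths, bit-widths, and timings below are invented placeholders, not values from the paper.

```python
# Hypothetical sketch of per-device latency prediction from a layer-wise lookup
# table (LUT), in the spirit of the hardware-constrained search described above.
# All entries are invented for illustration.

# LUT for one device: (layer_type, width, bit_width) -> measured latency in ms.
EDGE_LUT = {
    ("conv", 64, 8): 1.2,
    ("conv", 64, 4): 0.7,
    ("conv", 128, 8): 2.5,
    ("conv", 128, 4): 1.4,
    ("attn", 128, 8): 3.0,
    ("attn", 128, 4): 1.8,
}

def estimate_latency(subnet, lut):
    """Predict end-to-end latency by summing per-layer LUT entries."""
    return sum(lut[layer] for layer in subnet)

mixed_precision_subnet = [("conv", 64, 4), ("conv", 128, 4), ("attn", 128, 8)]
print(estimate_latency(mixed_precision_subnet, EDGE_LUT))  # ~5.1 ms
```

Because the table is populated by profiling each target device once, such a predictor lets a search rank candidate subnets without running them on hardware.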

📝 Abstract
The growing complexity of visuomotor policies poses significant challenges for deployment under heterogeneous robotic hardware constraints. Moreover, most existing model-efficiency approaches for robotic manipulation are device- and model-specific, lack generalizability, and require time-consuming per-device optimization during adaptation. In this work, we propose a unified framework named Device-Conditioned Quantization-For-All (DC-QFA), which amortizes deployment effort through device-conditioned quantization-aware training and hardware-constrained architecture search. Specifically, we introduce a single supernet that spans a rich design space over network architectures and mixed-precision bit-widths. It is optimized with latency- and memory-aware regularization, guided by per-device lookup tables. With this supernet, for each target platform we can perform a once-for-all lightweight search that selects an optimal subnet without any per-device re-optimization, enabling more generalizable deployment across heterogeneous hardware and substantially reducing deployment time. To improve long-horizon stability under low precision, we further introduce multi-step on-policy distillation, which mitigates error accumulation during closed-loop execution. Extensive experiments on three representative policy backbones (DiffusionPolicy-T, MDT-V, and OpenVLA-OFT) demonstrate that DC-QFA achieves 2–3× acceleration on edge devices, consumer-grade GPUs, and cloud platforms, with a negligible drop in task success. Real-world evaluations on an Inovo robot equipped with a force/torque sensor further validate that our low-bit DC-QFA policies maintain stable performance in contact-rich manipulation even under severe quantization.
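As a rough illustration of the once-for-all, hardware-constrained subnet search the abstract describes, the toy sketch below exhaustively scores candidate (width, bit-width) configurations against a per-device latency lookup table. The search space, latency numbers, and accuracy proxy are all invented for this sketch; the paper's actual predictors and design space are not reproduced here.

```python
import itertools

# Toy once-for-all subnet search under a device latency budget.
# All numbers below are illustrative placeholders.

WIDTHS = [64, 128]   # candidate layer widths
BITS = [4, 8]        # candidate bit-widths
DEPTH = 3            # fixed subnet depth for the toy example

LUT = {(64, 4): 0.5, (64, 8): 0.9, (128, 4): 1.1, (128, 8): 2.0}  # ms per layer

def latency(cfg):
    """Predicted latency: sum of per-layer LUT entries."""
    return sum(LUT[layer] for layer in cfg)

def accuracy_proxy(cfg):
    # Toy proxy: wider layers and higher precision score higher.
    return sum(w / 128 + b / 8 for w, b in cfg)

def search(budget_ms):
    """Exhaustively pick the best-scoring config that fits the latency budget."""
    space = itertools.product(itertools.product(WIDTHS, BITS), repeat=DEPTH)
    feasible = [cfg for cfg in space if latency(cfg) <= budget_ms]
    return max(feasible, key=accuracy_proxy, default=None)

best = search(budget_ms=3.0)   # here: three 8-bit layers of width 64
```

Calling `search` with a different device's table and budget yields a different subnet, which is the sense in which one supernet training run amortizes deployment across platforms.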
Problem

Research questions and friction points this paper is trying to address.

neural architecture search
device heterogeneity
model efficiency
robotic manipulation
quantization
Innovation

Methods, ideas, or system contributions that make the work stand out.

device-conditioned
quantization-aware training
neural architecture search
once-for-all deployment
on-policy distillation
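The "on-policy distillation" idea above can be illustrated with a toy linear-policy example: the student executes its own actions and is regressed toward a teacher at every state it actually visits, which is the mechanism that counters compounding closed-loop error. The dynamics, policies, and hyperparameters below are made up for the sketch and are not from the paper.

```python
import numpy as np

# Toy multi-step on-policy distillation. The student rolls out its *own*
# actions; the teacher relabels each visited state with a target action.

W_teacher = np.array([[0.5, -0.2], [0.1, 0.3]])  # "full-precision" linear policy
W_student = np.zeros((2, 2))                      # lightweight/quantized student

def step(state, action):
    return 0.9 * state + action                   # toy linear dynamics

starts = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
for it in range(400):                             # distillation iterations
    state = starts[it % 2]
    grad = np.zeros_like(W_student)
    for _ in range(5):                            # multi-step on-policy rollout
        a_student = W_student @ state
        a_teacher = W_teacher @ state             # teacher relabels this state
        grad += np.outer(a_student - a_teacher, state)  # d(MSE)/dW
        state = step(state, a_student)            # follow the student's action
    W_student -= 0.05 * grad / 5                  # gradient step on rollout loss

# After training, the student closely matches the teacher on its own rollouts.
```

Supervising the student along its own trajectories, rather than only on teacher-visited states, is what keeps errors from accumulating over long closed-loop horizons.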