RoboMatch: A Mobile-Manipulation Teleoperation Platform with Auto-Matching Network Architecture for Long-Horizon Manipulation

📅 2025-09-10

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address performance degradation, inefficient data collection, low task accuracy, and control instability in long-horizon teleoperation under dynamic environments, this paper proposes RoboMatch, a mobile dual-arm teleoperation platform for long-horizon manipulation. The platform features an integrated cockpit-style interface that enables coordinated control of the mobile base and both robotic arms. The authors design an Auto-Matching Network (AMN) architecture that decomposes long-horizon tasks into logical subtask sequences and dynamically assigns lightweight pre-trained models for distributed inference. They also introduce the Proprioceptive-Visual Enhanced Diffusion Policy (PVE-DP), which combines multi-scale visual feature extraction via the Discrete Wavelet Transform (DWT) with high-precision IMU feedback from the end-effector to improve fine manipulation performance. Experiments report an improvement of over 20% in data collection efficiency, a 20–30% increase in task success rate with PVE-DP, and roughly 40% better long-horizon inference performance with AMN, which improves robustness and practicality in complex, dynamic scenarios.

📝 Abstract

This paper presents RoboMatch, a novel unified teleoperation platform for mobile manipulation with an auto-matching network architecture, designed to tackle long-horizon tasks in dynamic environments. Our system enhances teleoperation performance, data collection efficiency, task accuracy, and operational stability. The core of RoboMatch is a cockpit-style control interface that enables synchronous operation of the mobile base and dual arms, significantly improving control precision and data collection. Moreover, we introduce the Proprioceptive-Visual Enhanced Diffusion Policy (PVE-DP), which leverages Discrete Wavelet Transform (DWT) for multi-scale visual feature extraction and integrates high-precision IMUs at the end-effector to enrich proprioceptive feedback, substantially boosting fine manipulation performance. Furthermore, we propose an Auto-Matching Network (AMN) architecture that decomposes long-horizon tasks into logical sequences and dynamically assigns lightweight pre-trained models for distributed inference. Experimental results demonstrate that our approach improves data collection efficiency by over 20%, increases task success rates by 20-30% with PVE-DP, and enhances long-horizon inference performance by approximately 40% with AMN, offering a robust solution for complex manipulation tasks.
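The abstract does not say which wavelet, how many decomposition levels, or how the subbands feed the policy. As a rough illustration of what multi-scale feature extraction with a DWT means, the sketch below assumes a Haar wavelet and a grayscale image stored as nested lists. The function names (`haar_1d`, `haar_2d`, `multiscale_features`) are hypothetical, and this is not the PVE-DP implementation.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def haar_1d(signal: List[float]) -> Tuple[List[float], List[float]]:
    """One level of the orthonormal Haar DWT on an even-length sequence."""
    scale = 2 ** 0.5
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / scale for a, b in pairs]  # low-pass: local averages
    detail = [(a - b) / scale for a, b in pairs]  # high-pass: local differences
    return approx, detail


def _transpose(m: Matrix) -> Matrix:
    return [list(col) for col in zip(*m)]


def _along_columns(m: Matrix) -> Tuple[Matrix, Matrix]:
    """Apply haar_1d to every column and return (low, high) matrices."""
    low, high = [], []
    for col in _transpose(m):
        a, d = haar_1d(col)
        low.append(a)
        high.append(d)
    return _transpose(low), _transpose(high)


def haar_2d(image: Matrix) -> Tuple[Matrix, Matrix, Matrix, Matrix]:
    """One 2D Haar level: filter each row, then each column.

    Returns the approximation band followed by three detail bands,
    each half the height and half the width of the input.
    """
    if not image or len(image) % 2 or len(image[0]) % 2:
        raise ValueError("image height and width must be even and non-zero")
    row_low, row_high = [], []
    for row in image:
        a, d = haar_1d(row)
        row_low.append(a)
        row_high.append(d)
    ll, lh = _along_columns(row_low)   # low along rows, then low/high along columns
    hl, hh = _along_columns(row_high)  # high along rows, then low/high along columns
    return ll, lh, hl, hh


def multiscale_features(image: Matrix, levels: int):
    """Repeat the 2D transform on the approximation band.

    Returns the coarsest approximation and a list of detail-band
    triples, ordered from the finest scale to the coarsest.
    """
    pyramid = []
    current = image
    for _ in range(levels):
        current, lh, hl, hh = haar_2d(current)
        pyramid.append((lh, hl, hh))
    return current, pyramid
```

Each level halves the resolution, so the detail bands at successive levels capture edges and texture at progressively coarser scales. A visual encoder could consume these bands alongside the raw image, though how PVE-DP does so is not described here.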

Problem

Research questions and friction points this paper is trying to address.

Enhancing teleoperation for long-horizon mobile manipulation tasks

Improving data collection efficiency and task success rates

Decomposing complex long-horizon tasks with an Auto-Matching Network architecture

Innovation

Methods, ideas, or system contributions that make the work stand out.

Cockpit-style control interface for synchronous operation of the mobile base and dual arms

Proprioceptive-Visual Enhanced Diffusion Policy with DWT feature extraction

Auto-Matching Network architecture for distributed inference on task sequences
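The Auto-Matching idea in the list above (split a long-horizon task into an ordered sequence of subtasks, then pick a lightweight pre-trained model for each one) can be pictured with a small dispatch sketch. Everything here is an assumption made for illustration: the `Subtask` and `AutoMatcher` names, the fixed lookup-table plan, and the skill labels are hypothetical, and the real AMN presumably learns or infers the matching rather than reading it from a dictionary.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Observation = Dict[str, float]
Policy = Callable[[Observation], str]  # stand-in for a lightweight pre-trained model


@dataclass(frozen=True)
class Subtask:
    name: str   # human-readable step
    skill: str  # key used to match the step to a policy


# Hypothetical hand-written plans; a real system would generate these.
PLANS: Dict[str, List[Subtask]] = {
    "fetch_cup": [
        Subtask("drive to table", "navigate"),
        Subtask("grasp cup", "grasp"),
        Subtask("drive to user", "navigate"),
        Subtask("hand over cup", "place"),
    ],
}


def decompose(task: str) -> List[Subtask]:
    """Return the ordered subtask sequence for a known task."""
    return list(PLANS[task])


class AutoMatcher:
    """Matches each subtask to a registered policy and runs them in order."""

    def __init__(self) -> None:
        self._registry: Dict[str, Policy] = {}

    def register(self, skill: str, policy: Policy) -> None:
        self._registry[skill] = policy

    def match(self, subtask: Subtask) -> Policy:
        if subtask.skill not in self._registry:
            raise LookupError(f"no policy registered for skill '{subtask.skill}'")
        return self._registry[subtask.skill]

    def run(self, task: str, obs: Observation) -> List[Tuple[str, str]]:
        """Run every subtask once and log (subtask name, policy output)."""
        log = []
        for subtask in decompose(task):
            policy = self.match(subtask)
            log.append((subtask.name, policy(obs)))
        return log


matcher = AutoMatcher()
matcher.register("navigate", lambda obs: "base_velocity_cmd")
matcher.register("grasp", lambda obs: "close_gripper_cmd")
matcher.register("place", lambda obs: "open_gripper_cmd")
trace = matcher.run("fetch_cup", {"battery": 0.9})
```

Keeping one small policy per skill means only the model for the current subtask has to be active at each step, which is one plausible reading of "distributed inference" with lightweight models.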

🔎 Similar Papers

Whole-Body Teleoperation for Mobile Manipulation at Zero Added Cost