🤖 AI Summary
This work addresses key challenges in long-horizon, multi-step routing of deformable linear objects (e.g., cables, ropes): difficulty in dynamic modeling, weak high-level planning, and poor fault tolerance. We propose a hierarchical framework synergizing vision-language models (VLMs) and reinforcement learning (RL). At the high level, a VLM interprets natural-language instructions and performs context-aware, multi-step skill planning; at the low level, RL trains robust, executable manipulation policies. Crucially, we introduce a state redirection mechanism enabling online recovery and adaptive plan adjustment during execution. Evaluated on diverse long-horizon routing tasks, our approach achieves a 92.5% overall success rate, nearly 50 percentage points higher than the best baseline, demonstrating substantial improvements in generalization across complex scenes, compositional reasoning, and system robustness.
📝 Abstract
Long-horizon routing tasks of deformable linear objects (DLOs), such as cables and ropes, are common in industrial assembly lines and everyday life. These tasks are particularly challenging because they require robots to manipulate DLOs with long-horizon planning and reliable skill execution. Successfully completing such tasks demands adapting to their nonlinear dynamics, decomposing abstract routing goals, and generating multi-step plans composed of multiple skills, all of which require accurate high-level reasoning during execution. In this paper, we propose a fully autonomous hierarchical framework for solving challenging DLO routing tasks. Given an implicit or explicit routing goal expressed in language, our framework leverages vision-language models (VLMs) for in-context high-level reasoning to synthesize feasible plans, which are then executed by low-level skills trained via reinforcement learning. To improve robustness over long horizons, we further introduce a failure recovery mechanism that reorients the DLO into insertion-feasible states. Our approach generalizes to diverse scenes involving varied object attributes, spatial descriptions, and implicit language commands. It outperforms the next best baseline method by nearly 50 percentage points and achieves an overall success rate of 92.5% across long-horizon routing scenarios.
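The control flow described above (a VLM proposes a skill sequence, RL policies execute each skill, and a recovery step reorients the DLO when a skill fails) can be sketched as a simple hierarchical loop. This is a minimal illustrative sketch, not the paper's actual implementation: `plan_skills`, `execute_skill`, and `reorient_dlo` are hypothetical placeholder names standing in for the VLM planner, the learned low-level policies, and the state-redirection recovery mechanism.

```python
# Hedged sketch of the hierarchical plan-execute-recover loop.
# All function and class names here are illustrative placeholders,
# not the paper's API.
from dataclasses import dataclass

@dataclass
class Skill:
    name: str      # e.g. "grasp", "insert"
    target: str    # e.g. a clip or cable endpoint identifier

def plan_skills(instruction: str, scene: dict) -> list[Skill]:
    """Placeholder for VLM in-context planning: maps a language goal
    and a scene description to an ordered skill sequence."""
    return [Skill("grasp", "cable_end"),
            Skill("insert", "clip_1"),
            Skill("insert", "clip_2")]

def execute_skill(skill: Skill, scene: dict) -> bool:
    """Placeholder for an RL-trained low-level policy.
    Here the scene dict simply records whether each target is reachable."""
    return scene.get(skill.target, False)

def reorient_dlo(scene: dict) -> None:
    """Placeholder recovery step: redirect the DLO into an
    insertion-feasible state before retrying the failed skill."""
    ...

def route(instruction: str, scene: dict, max_retries: int = 2) -> bool:
    """Run the full loop: plan once, execute each skill, recover on failure."""
    for skill in plan_skills(instruction, scene):
        for _attempt in range(max_retries + 1):
            if execute_skill(skill, scene):
                break
            reorient_dlo(scene)      # state redirection, then retry
        else:
            return False             # skill kept failing: task aborted
    return True
```

The key structural point the sketch captures is that recovery is interleaved with execution: a failed skill triggers reorientation and a retry rather than a full replan, which is what makes the long-horizon sequence robust to individual skill failures.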