To Move or Not to Move: Constraint-based Planning Enables Zero-Shot Generalization for Interactive Navigation

📅 2026-02-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of visual navigation failure in real-world environments where clutter completely obstructs pathways. To overcome this, the authors propose a novel framework that integrates large language models (LLMs) with constraint-based planning to actively modify the environment. The approach leverages structured scene graphs for reasoning and dynamically determines which obstacles to move, where to place them, and which regions to perceive next, coordinating with a low-level motion planner to execute sequences of navigation, grasping, placement, or detour actions. By uniquely combining LLM-driven constraint planning with active perception, the method achieves zero-shot generalization for lifelong interactive navigation, enabling long-horizon decision-making and dynamic environmental adaptation. It significantly outperforms existing baselines in the ProcTHOR-10k simulation environment and demonstrates qualitative success on a physical robot platform.

📝 Abstract
Visual navigation typically assumes the existence of at least one obstacle-free path between start and goal, which must be discovered/planned by the robot. However, in real-world scenarios, such as home environments and warehouses, clutter can block all routes. Targeted at such cases, we introduce the Lifelong Interactive Navigation problem, where a mobile robot with manipulation abilities can move clutter to forge its own path to complete sequential object-placement tasks - each involving placing a given object (e.g., alarm clock, pillow) onto a target object (e.g., dining table, desk, bed). To address this lifelong setting - where environment changes accumulate and have long-term effects - we propose an LLM-driven, constraint-based planning framework with active perception. Our framework allows the LLM to reason over a structured scene graph of discovered objects and obstacles, deciding which object to move, where to place it, and where to look next to discover task-relevant information. This coupling of reasoning and active perception allows the agent to explore the regions expected to contribute to task completion rather than exhaustively mapping the environment. A standard motion planner then executes the corresponding navigate-pick-place or detour sequence, ensuring reliable low-level control. Evaluated in the physics-enabled ProcTHOR-10k simulator, our approach outperforms non-learning and learning-based baselines. We further demonstrate our approach qualitatively on real-world hardware.
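The high-level decision loop the abstract describes - reasoning over a scene graph to choose among moving an obstacle, placing the carried object, or looking at an unexplored region - can be sketched roughly as follows. This is an illustrative skeleton only: the `SceneGraph` structure, the `plan_next_action` function, and the rule-based decision logic (standing in for the paper's LLM-driven constraint reasoning) are all assumptions, not the authors' implementation.

```python
from dataclasses import dataclass

@dataclass
class SceneGraph:
    """Hypothetical structured scene graph of discovered objects.

    objects: name -> {"region": str, "blocks_path": bool}
    unexplored: regions not yet perceived by the robot.
    """
    objects: dict
    unexplored: list

def plan_next_action(graph: SceneGraph, carried: str, target: str):
    """Return one high-level action as a (verb, argument) tuple.

    A rule-based stand-in for the paper's LLM + constraint reasoning:
    look where task-relevant information may be, clear blocking
    clutter first, then navigate or place.
    """
    tgt = graph.objects.get(target)
    if tgt is None:
        # Target not yet discovered: actively perceive a promising
        # region instead of exhaustively mapping the environment.
        if graph.unexplored:
            return ("look", graph.unexplored[0])
        return ("detour", None)
    # Constraint: any obstacle known to block the route must be
    # relocated before the placement task can proceed.
    for name, info in graph.objects.items():
        if info.get("blocks_path"):
            return ("move", name)
    if carried:
        return ("place", target)
    return ("navigate", tgt["region"])
```

Each returned tuple would then be handed to a low-level motion planner that executes the corresponding navigate-pick-place or detour sequence.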
Problem

Research questions and friction points this paper is trying to address.

Interactive Navigation
Obstacle Clutter
Path Planning
Object Manipulation
Lifelong Navigation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Lifelong Interactive Navigation
Constraint-based Planning
LLM-driven Reasoning
Active Perception
Scene Graph