λ: A Benchmark for Data-Efficiency in Long-Horizon Indoor Mobile Manipulation Robotics

📅 2024-11-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
Data inefficiency hinders language-conditioned pick-and-place in complex indoor mobile manipulation (MoMa) tasks. Method: We introduce λ, the first long-horizon MoMa benchmark explicitly designed for data-efficiency evaluation. Built upon 571 human-collected demonstrations spanning multi-room and multi-floor scenarios, λ covers both simulated and real-world settings. We propose a lightweight, high-fidelity benchmark construction paradigm centered on human demonstrations, and a neuro-symbolic modular architecture that integrates foundation models, symbolic task planning, and motion planning, evaluated against behavioral cloning and reinforcement learning baselines. Contribution/Results: Pure learning methods exhibit poor data efficiency; in contrast, our neuro-symbolic approach achieves over 40% absolute success rate improvement with only a few demonstrations. λ is intended to serve as a key benchmark for evaluating data efficiency in MoMa.

📝 Abstract
Efficiently learning and executing long-horizon mobile manipulation (MoMa) tasks is crucial for advancing robotics in household and workplace settings. However, current MoMa models are data-inefficient, and no realistically sized benchmarks exist for evaluating how efficiently improved models learn. To address this, we introduce the LAMBDA (λ) benchmark (Long-horizon Actions for Mobile-manipulation Benchmarking of Directed Activities), which evaluates the data efficiency of models on language-conditioned, long-horizon, multi-room, multi-floor pick-and-place tasks using a dataset of manageable size that is feasible to collect. The benchmark includes 571 human-collected demonstrations that provide realism and diversity in simulated and real-world settings. Unlike planner-generated data, these trajectories offer natural variability and replay-verifiability, supporting robust learning and evaluation. We benchmark several models, including learning-based models and a neuro-symbolic modular approach that combines foundation models with task and motion planning. Learning-based models show suboptimal success rates, even when leveraging pretrained weights, underscoring significant data inefficiencies. In contrast, the neuro-symbolic approach performs significantly better while being more data-efficient. These findings highlight the need for more data-efficient learning-based MoMa approaches. λ addresses this gap by serving as a key benchmark for evaluating the data efficiency of future models on household robotics tasks.
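To make "data efficiency" concrete, one common way to report it is success rate as a function of the number of training demonstrations. The following is a minimal sketch under that assumption, not the benchmark's actual evaluation code: the function name `success_rate_by_budget`, the `(num_demos, succeeded)` episode format, and the toy outcomes are all hypothetical placeholders and do not reflect results from the paper.

```python
from collections import defaultdict


def success_rate_by_budget(episodes):
    """Return {num_training_demos: success_rate} from evaluation episodes.

    `episodes` is an iterable of (num_demos, succeeded) pairs, where
    `num_demos` is the number of demonstrations the model was trained on
    and `succeeded` is True if the evaluation episode completed the task.
    This layout is an assumption made for illustration only.
    """
    totals = defaultdict(int)
    successes = defaultdict(int)
    for num_demos, succeeded in episodes:
        totals[num_demos] += 1
        successes[num_demos] += int(succeeded)
    # Sort by demonstration budget so the result reads as a learning curve.
    return {n: successes[n] / totals[n] for n in sorted(totals)}


# Placeholder outcomes (NOT real benchmark results): four evaluation
# episodes for each of two hypothetical demonstration budgets.
toy_episodes = [
    (10, False), (10, False), (10, True), (10, False),
    (100, True), (100, False), (100, True), (100, True),
]

curve = success_rate_by_budget(toy_episodes)
for budget, rate in curve.items():
    print(f"{budget:>4} demos -> success rate {rate:.2f}")
```

A model is more data-efficient than another if its curve reaches a given success rate with a smaller demonstration budget.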
Problem

Research questions and friction points this paper is trying to address.

Data Efficiency
Complex Mobile Manipulation
Language Instructions
Innovation

Methods, ideas, or system contributions that make the work stand out.

LAMBDA Benchmark
Neuro-Symbolic Methods
Data Efficiency