λ: A Benchmark for Data-Efficiency in Long-Horizon Indoor Mobile Manipulation Robotics

📅 2024-11-28

🤖 AI Summary

Problem: Data inefficiency hinders language-driven pick-and-place in complex indoor mobile manipulation (MoMa) tasks. Method: We introduce λ, a long-horizon MoMa benchmark explicitly designed for data-efficiency evaluation. Built on 571 human-collected demonstrations spanning cross-room and cross-floor scenarios, λ covers both simulated and real-world settings. We propose a lightweight, high-fidelity benchmark construction paradigm centered on human demonstrations, and a neuro-symbolic modular architecture that integrates foundation models, symbolic task planning, and motion planning. This architecture is evaluated against behavioral cloning and reinforcement learning baselines. Contribution/Results: Pure learning methods exhibit poor data efficiency; in contrast, the neuro-symbolic approach achieves over 40% absolute success rate improvement with only a few demonstrations, significantly enhancing robustness. λ is intended to serve as a benchmark for evaluating the data efficiency of future MoMa models.

📝 Abstract

Efficiently learning and executing long-horizon mobile manipulation (MoMa) tasks is crucial for advancing robotics in household and workplace settings. However, current MoMa models are data-inefficient, which calls for improved models and for realistically sized benchmarks that can evaluate their efficiency; no such benchmarks currently exist. To address this, we introduce the LAMBDA (λ) benchmark (Long-horizon Actions for Mobile-manipulation Benchmarking of Directed Activities), which evaluates the data efficiency of models on language-conditioned, long-horizon, multi-room, multi-floor, pick-and-place tasks using a dataset of manageable size that is feasible to collect. The benchmark includes 571 human-collected demonstrations that provide realism and diversity in simulated and real-world settings. Unlike planner-generated data, these trajectories offer natural variability and replay-verifiability, ensuring robust learning and evaluation. We benchmark several models, including learning-based models and a neuro-symbolic modular approach that combines foundation models with task and motion planning. Learning-based models show suboptimal success rates, even when leveraging pretrained weights, underscoring significant data inefficiencies. The neuro-symbolic approach, by contrast, performs significantly better while being more data-efficient. These findings highlight the need for more data-efficient learning-based MoMa approaches. λ addresses this gap by serving as a key benchmark for evaluating the data efficiency of future models on household robotics tasks.
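
The notion of data efficiency in the abstract can be made concrete as success rate measured against the number of demonstrations a model was trained on. The following is a minimal sketch under stated assumptions, not the benchmark's actual evaluation code: the `EpisodeResult` record, its field names, and the `success_rate_by_budget` helper are hypothetical names introduced here for illustration, and the values in the usage lines are placeholders, not results from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EpisodeResult:
    """Outcome of one evaluation episode (hypothetical record, not from the paper)."""
    num_train_demos: int  # assumed: how many demonstrations the model was trained on
    success: bool         # assumed: whether the pick-and-place task was completed


def success_rate_by_budget(results):
    """Return {demonstration budget: success rate}, ordered by increasing budget."""
    counts = defaultdict(lambda: [0, 0])  # budget -> [successes, episodes]
    for r in results:
        counts[r.num_train_demos][0] += int(r.success)
        counts[r.num_train_demos][1] += 1
    return {n: s / t for n, (s, t) in sorted(counts.items())}


# Placeholder episodes for illustration only; these are not measured results.
example = [
    EpisodeResult(10, True),
    EpisodeResult(10, False),
    EpisodeResult(50, True),
    EpisodeResult(50, True),
]
curve = success_rate_by_budget(example)
```

Under this framing, a model counts as more data-efficient when its curve reaches a given success rate at a smaller demonstration budget.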

Problem

Research questions and friction points this paper is trying to address.

Data Efficiency

Complex Mobile Manipulation

Language Instructions

Innovation

Methods, ideas, or system contributions that make the work stand out.

LAMBDA Benchmark

Neuro-Symbolic Methods

Data Efficiency
