Quality Over Quantity: Curating Contact-Based Robot Datasets Improves Learning

📅 2025-10-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the trade-off between data quality and quantity in robot learning, with particular emphasis on the critical role of contact information in dynamics and shape modeling. We propose a contact-aware Fisher information metric that incorporates pose and contact signals into the objective function to quantify the information content of individual data samples, enabling efficient data selection and refinement. In contrast to conventional paradigms relying on large-scale, low-quality datasets, our approach achieves substantial improvements in learning efficiency, stability, and generalization using only a small number of high-information samples. Our key contribution is the first explicit integration of contact awareness into the Fisher information framework, establishing a principled, interpretable, and computationally tractable criterion for evaluating data utility in contact-rich robotic learning—thereby advancing high-quality, data-driven embodied intelligence.

📝 Abstract

In this paper, we investigate the utility of datasets and whether more data or the 'right' data is advantageous for robot learning. In particular, we are interested in quantifying the utility of contact-based data, as contact holds significant information for robot learning. Our approach derives a contact-aware objective function for learning object dynamics and shape from pose and contact data. We show that the contact-aware Fisher-information metric can be used to rank and curate contact data based on how informative the data is for learning. In addition, we find that selecting a reduced dataset based on this ranking improves the learning task while also making learning a deterministic process. Interestingly, our results show that more data is not necessarily advantageous; rather, less but more informative data can accelerate learning, depending on the contact interactions. Last, we show how our metric can be used to provide initial guidance on data curation for contact-based robot learning.

Problem

Research questions and friction points this paper is trying to address.

Investigating whether more data or better-quality data improves robot learning

Quantifying the utility of contact-based data for learning object dynamics and shape

Developing a contact-aware metric to rank and curate informative datasets

Innovation

Methods, ideas, or system contributions that make the work stand out.

Contact-aware objective function for learning object dynamics and shape from pose and contact data

Fisher-information metric ranks contact data by how informative it is for learning

A reduced, informative dataset improves learning and makes it a deterministic process
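
The ranking idea above can be illustrated with a small sketch. This is not the paper's metric, whose exact form is not given on this page. It is a toy that assumes Gaussian measurement noise, so that the Fisher information of one sample is J^T J / sigma^2, where J is the Jacobian of the predicted observation with respect to the model parameters. The function names (fisher_scores, select_top_k), the random Jacobians, and the rule that samples without contact have zero sensitivity are all hypothetical choices made for illustration.

```python
import numpy as np


def fisher_scores(jacobians, sigma=1.0):
    """Score each sample by the trace of its Fisher information matrix.

    Assumes Gaussian noise with standard deviation sigma, so the per-sample
    Fisher information is J^T J / sigma^2 and its trace is the sum of the
    squared Jacobian entries divided by sigma^2.
    jacobians has shape (N, m, p): N samples, m outputs, p parameters.
    """
    return np.einsum("nij,nij->n", jacobians, jacobians) / sigma**2


def select_top_k(scores, k):
    """Return the indices of the k highest-scoring samples, best first."""
    order = np.argsort(-scores, kind="stable")
    return order[:k]


# Toy data (hypothetical): random Jacobians standing in for a real model.
rng = np.random.default_rng(0)
N, m, p = 200, 2, 3
jac = rng.normal(size=(N, m, p))

# Toy assumption: a sample with no contact tells us nothing about the
# parameters, so its Jacobian is set to zero.
in_contact = rng.random(N) < 0.3
jac[~in_contact] = 0.0

scores = fisher_scores(jac, sigma=0.1)
keep = select_top_k(scores, 20)  # indices of the curated subset
```

In this toy, samples without contact receive a score of zero and fall to the bottom of the ranking, which mirrors the claim that contact carries most of the information. Using the trace is only one possible scalar summary; a determinant-based or greedy selection would also account for redundancy between samples.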

🔎 Similar Papers

DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset