Using Temperature Sampling to Effectively Train Robot Learning Policies on Imbalanced Datasets

📅 2025-10-22

🤖 AI Summary

In robotic multi-task learning, severe imbalance among physical action categories degrades policy generalization. To address this, we propose a lightweight temperature-sampling strategy that adjusts how often each action category is sampled during supervised training. It leaves the model architecture and training objective unchanged and fits into standard pre-training and fine-tuning pipelines. Evaluated in simulation and in a real-world setup on a Franka Panda robot arm, our method substantially improves performance on low-resource tasks compared to prior state-of-the-art methods, without degrading performance on high-resource ones. The core contribution is a low-overhead remedy for action-distribution skew that is computationally cheap, architecture-agnostic, and practical to deploy.

📝 Abstract

Increasingly large datasets of robot actions and sensory observations are being collected to train ever-larger neural networks. These datasets are collected on a per-task basis, and while the tasks may be distinct in their descriptions, many involve very similar physical action sequences (e.g., 'pick up an apple' versus 'pick up an orange'). As a result, many datasets of robotic tasks are substantially imbalanced in terms of the physical robotic actions they represent. In this work, we propose a simple sampling strategy for policy training that mitigates this imbalance. Our method requires only a few lines of code to integrate into existing codebases and improves generalization. We evaluate our method both when pre-training small models and when fine-tuning large foundation models. Our results show substantial improvements on low-resource tasks compared to prior state-of-the-art methods, without degrading performance on high-resource tasks. This enables more effective use of model capacity for multi-task policies. We further validate our approach in a real-world setup on a Franka Panda robot arm across a diverse set of tasks.
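The abstract does not spell out the sampling rule, so the following is only a minimal sketch of temperature sampling in its commonly used form, where a category with n_i examples is drawn with probability proportional to n_i ** (1 / T). The function name `temperature_probs`, the category names, and the counts are hypothetical and are not taken from the paper.

```python
def temperature_probs(task_counts, temperature):
    """Return per-category sampling probabilities p_i proportional to n_i ** (1 / T).

    Assumption: this is the standard temperature-sampling form; the paper's
    exact formulation may differ.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = {
        task: count ** (1.0 / temperature)
        for task, count in task_counts.items()
    }
    total = sum(scaled.values())
    return {task: value / total for task, value in scaled.items()}


# Hypothetical example counts for three physical action categories.
counts = {"pick": 9000, "pour": 900, "open_drawer": 100}

for t in (1.0, 2.0, 5.0):
    probs = temperature_probs(counts, t)
    print(f"T={t}:", {name: round(p, 3) for name, p in probs.items()})
```

With T = 1 the probabilities equal the empirical frequencies (0.9, 0.09, 0.01 for the counts above), and as T grows they move toward a uniform distribution over categories, so low-resource categories are drawn more often.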

Problem

Research questions and friction points this paper is trying to address.

Addressing dataset imbalance in robot action sequences

Improving policy generalization for multi-task learning

Enhancing performance on low-resource robotic tasks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Temperature sampling rebalances skewed robot datasets

Method integrates easily into existing training codebases

Improves low-resource task performance without degrading others
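To illustrate how such a strategy might slot into an existing training loop, here is a hedged sketch that turns per-example category labels into per-example sampling weights and draws one batch of indices. It assumes the common temperature-sampling form (category probability proportional to n ** (1 / T)); the helper `example_weights`, the labels, and the batch size are hypothetical and do not come from the paper.

```python
import random


def example_weights(labels, temperature):
    """Per-example weights whose per-category totals scale as n ** (1 / T).

    Each of the n examples in a category gets weight n ** (1 / T - 1), so the
    category's total weight is n ** (1 / T). Assumed form, not the paper's code.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    exponent = 1.0 / temperature - 1.0
    return [counts[label] ** exponent for label in labels]


# Hypothetical dataset: one category label per training example.
labels = ["pick"] * 90 + ["pour"] * 9 + ["open_drawer"] * 1
weights = example_weights(labels, temperature=2.0)

# Draw one batch of example indices with replacement, using the weights.
rng = random.Random(0)
batch_indices = rng.choices(range(len(labels)), weights=weights, k=32)
batch_labels = [labels[i] for i in batch_indices]
```

In a PyTorch codebase, the same `weights` list could be passed to `torch.utils.data.WeightedRandomSampler`, which keeps the model and loss untouched and changes only the data-loading step.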
