AI Summary
Most existing deep reinforcement learning (DRL) approaches emphasize spatial symmetries while overlooking temporal symmetry, a critical inductive bias for robotic manipulation tasks such as door opening/closing. This work addresses this gap by systematically incorporating time-reversal symmetry into DRL. We propose TR-DRL, a novel framework featuring: (i) a dynamics-consistency filter to identify reversible trajectories; (ii) trajectory inversion augmentation; and (iii) time-reversed reward shaping to guide policy learning. Evaluated on Robosuite and MetaWorld benchmarks under both single-task and multi-task settings, TR-DRL achieves an average 2.1× improvement in sample efficiency and a 12.7% absolute gain in task success rate over state-of-the-art spatial-symmetry methods. Our results demonstrate that explicitly modeling temporal symmetry significantly enhances learning efficiency and robustness in embodied manipulation, establishing a new paradigm for time-symmetry-driven, sample-efficient robotic skill acquisition.
Abstract
Symmetry is pervasive in robotics and has been widely exploited to improve sample efficiency in deep reinforcement learning (DRL). However, existing approaches primarily focus on spatial symmetries, such as reflection, rotation, and translation, while largely neglecting temporal symmetries. To address this gap, we explore time reversal symmetry, a form of temporal symmetry commonly found in robotics tasks such as door opening and closing. We propose Time Reversal symmetry enhanced Deep Reinforcement Learning (TR-DRL), a framework that combines trajectory reversal augmentation and time reversal guided reward shaping to efficiently solve temporally symmetric tasks. Our method generates reversed transitions from fully reversible transitions, identified by a proposed dynamics-consistent filter, to augment the training data. For partially reversible transitions, we instead apply reward shaping to guide learning, using successful trajectories from the reversed task as the reference. Extensive experiments on the Robosuite and MetaWorld benchmarks demonstrate that TR-DRL is effective in both single-task and multi-task settings, achieving higher sample efficiency and stronger final performance compared to baseline methods.
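To make the augmentation idea concrete, the following is a minimal, illustrative sketch (not the paper's exact algorithm) of trajectory reversal augmentation with a dynamics-consistency check. The function `reverse_augment`, the threshold value, and the assumption that an action's reverse is its negation are all hypothetical choices for this example; `dynamics_model` stands in for any learned forward dynamics model.

```python
import numpy as np

def reverse_augment(transitions, dynamics_model, threshold=0.1):
    """Illustrative sketch of reversal augmentation with a
    dynamics-consistency filter (assumed details, not the paper's code).

    transitions: list of (s, a, r, s_next) arrays.
    dynamics_model: callable (state, action) -> predicted next state.
    """
    augmented = []
    for s, a, r, s_next in transitions:
        # Hypothetical reverse-action map: for a symmetric task such as
        # door opening/closing, we assume the reverse of action a is -a.
        a_rev = -a
        # Check whether playing the reversed action from s_next actually
        # returns near s under the forward dynamics model.
        s_pred = dynamics_model(s_next, a_rev)
        # Keep the reversed transition only if it is dynamics-consistent.
        if np.linalg.norm(s_pred - s) < threshold:
            augmented.append((s_next, a_rev, r, s))
    return augmented
```

Under toy linear dynamics `s' = s + a`, every transition is fully reversible, so each one yields a reversed counterpart; a transition whose reversal the model cannot reproduce would be filtered out and (in the full method) handled by reward shaping instead.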