TSN-Affinity: Similarity-Driven Parameter Reuse for Continual Offline Reinforcement Learning

📅 2026-04-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses catastrophic forgetting in continual offline reinforcement learning, along with the memory overhead and distribution mismatch that replay mechanisms introduce. It proposes a replay-free, similarity-guided architectural reuse method that combines Decision Transformer with TinySubNetworks. Using action compatibility and latent representation similarity, the approach provides task-specific parameterization and controlled knowledge sharing across tasks, complemented by a task-routing mechanism that activates the relevant subnetworks. The authors position it as an architectural alternative to replay in a setting where such methods remain underexplored. On benchmarks built from Atari games and Franka Emika Panda manipulation tasks, sparse subnetworks show strong retention of previously learned tasks, and routing further improves multi-task performance.

📝 Abstract

Continual offline reinforcement learning (CORL) aims to learn a sequence of tasks from datasets collected over time while preserving performance on previously learned tasks. This setting corresponds to domains where new tasks arise over time, but adapting the model through live environment interaction is expensive, risky, or impossible. However, CORL combines two difficulties: learning from fixed offline data, and adapting to new tasks without catastrophic forgetting. Replay-based continual learning approaches remain a strong baseline but incur memory overhead and suffer from a distribution mismatch between replayed samples and newly learned policies. At the same time, architectural continual learning methods have shown strong potential in supervised learning but remain underexplored in CORL. In this work, we propose TSN-Affinity, a novel CORL method based on TinySubNetworks and Decision Transformer. The method enables task-specific parameterization and controlled knowledge sharing through an RL-aware reuse strategy that routes tasks according to action compatibility and latent similarity. We evaluate the approach on benchmarks based on Atari games and simulations of manipulation tasks with the Franka Emika Panda robotic arm, covering both discrete and continuous control. Results show strong retention from sparse subnetworks, with routing further improving multi-task performance. Our findings suggest that similarity-guided architectural reuse is a strong and viable alternative to replay-based strategies in a CORL setting. Our code is available at: https://github.com/anonymized-for-submission123/tsn-affinity.
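
The routing idea described in the abstract (reuse parameters only when actions are compatible and latent representations are similar) can be illustrated with a minimal sketch. This is not the authors' implementation: the `TaskProfile` fields, the cosine-similarity measure, the `threshold` value, and the function names are all hypothetical, chosen only to show how an action-compatibility filter and a latent-similarity score might combine into a reuse decision.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class TaskProfile:
    """Hypothetical per-task summary used for routing (an assumption)."""
    name: str
    action_type: str          # "discrete" or "continuous"
    action_dim: int           # number of actions or action dimensions
    latent: Sequence[float]   # assumed mean latent embedding of the dataset


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def actions_compatible(a: TaskProfile, b: TaskProfile) -> bool:
    """Assumed compatibility rule: same action type and same dimension."""
    return a.action_type == b.action_type and a.action_dim == b.action_dim


def route(new_task: TaskProfile, learned: List[TaskProfile],
          threshold: float = 0.8) -> Optional[str]:
    """Return the earlier task whose subnetwork to reuse, or None.

    None means no compatible task is similar enough, so a fresh
    subnetwork would be allocated instead of sharing parameters.
    """
    best_name, best_sim = None, threshold
    for old in learned:
        if not actions_compatible(new_task, old):
            continue  # never share across incompatible action spaces
        sim = cosine(new_task.latent, old.latent)
        if sim >= best_sim:
            best_name, best_sim = old.name, sim
    return best_name


learned = [
    TaskProfile("task_a", "discrete", 6, [1.0, 0.0, 0.2]),
    TaskProfile("task_b", "continuous", 7, [0.9, 0.1, 0.3]),
]
new = TaskProfile("task_c", "discrete", 6, [0.9, 0.1, 0.3])
print(route(new, learned))  # prints "task_a"
```

In this sketch, `task_b` has a latent vector identical to the new task's but is skipped because its action space differs, which is the point of applying the compatibility check before any similarity score is considered.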

Problem

Research questions and friction points this paper is trying to address.

Continual Offline Reinforcement Learning

Catastrophic Forgetting

Task Sequence Learning

Offline RL

Model Adaptation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Continual Offline Reinforcement Learning

Parameter Reuse

TinySubNetworks