Flexible Multitask Learning with Factorized Diffusion Policy

📅 2025-12-26

📈 Citations: 0

✨ Influential: 0

career value

240K/year

🤖 AI Summary

Robotic multi-task learning faces challenges including highly multimodal action distributions, insufficient representational capacity of monolithic models, and poor adaptability. To address these, we propose a modular diffusion policy framework featuring a novel factorized-diffusion architecture that decomposes complex action distributions into composable, addable, and removable specialized diffusion submodules. This design enables disentangled modeling of behavioral submodalities and supports incremental fine-tuning of individual components. Critically, it achieves zero-forgetting task expansion and efficient sim-to-real transfer. Experiments demonstrate that our method consistently outperforms both strong modular and monolithic baselines in both simulated and real-robot manipulation tasks, yielding substantial improvements in cross-task generalization and adaptation efficiency.

Technology Category

Application Category

📝 Abstract

Multitask learning poses significant challenges due to the highly multimodal and diverse nature of robot action distributions. However, effectively fitting policies to these complex task distributions is often difficult, and existing monolithic models often underfit the action distribution and lack the flexibility required for efficient adaptation. We introduce a novel modular diffusion policy framework that factorizes complex action distributions into a composition of specialized diffusion models, each capturing a distinct sub-mode of the behavior space for a more effective overall policy. In addition, this modular structure enables flexible policy adaptation to new tasks by adding or fine-tuning components, which inherently mitigates catastrophic forgetting. Empirically, across both simulation and real-world robotic manipulation settings, we illustrate how our method consistently outperforms strong modular and monolithic baselines.

Problem

Research questions and friction points this paper is trying to address.

Multitask learning struggles with multimodal robot action distributions

Existing monolithic models underfit actions and lack adaptation flexibility

Catastrophic forgetting hinders efficient policy adaptation to new tasks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Modular diffusion policy factorizes action distributions

Specialized diffusion models capture distinct behavior sub-modes

Flexible adaptation by adding or fine-tuning components

🔎 Similar Papers

Task-agnostic Pre-training and Task-guided Fine-tuning for Versatile Diffusion Planner