StableMTL: Repurposing Latent Diffusion Models for Multi-Task Learning from Partially Annotated Synthetic Datasets

📅 2025-06-09
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the prohibitively high annotation cost of full-task supervision in multi-task dense prediction, this paper introduces the first zero-shot multi-task learning paradigm: a general-purpose model is trained solely on multiple partially annotated synthetic datasets, each covering only a subset of tasks. Methodologically, we replace conventional weighted loss functions with a unified latent-space loss; propose a multi-stream architecture coupled with a 1-to-N task attention mechanism to enable efficient cross-task collaboration; and develop a latent-variable regression framework built upon Stable Diffusion, integrating task encoding, conditional diffusion modeling, and customized training strategies. Evaluated across eight benchmarks and seven dense prediction task categories, our approach consistently surpasses state-of-the-art methods, demonstrating strong scalability and generalization to unseen task combinations.

Technology Category

Application Category

📝 Abstract
Multi-task learning for dense prediction is limited by the need for extensive annotation for every task, though recent works have explored training with partial task labels. Leveraging the generalization power of diffusion models, we extend the partial learning setup to a zero-shot setting, training a multi-task model on multiple synthetic datasets, each labeled for only a subset of tasks. Our method, StableMTL, repurposes image generators for latent regression. Adapting a denoising framework with task encoding, per-task conditioning and a tailored training scheme. Instead of per-task losses requiring careful balancing, a unified latent loss is adopted, enabling seamless scaling to more tasks. To encourage inter-task synergy, we introduce a multi-stream model with a task-attention mechanism that converts N-to-N task interactions into efficient 1-to-N attention, promoting effective cross-task sharing. StableMTL outperforms baselines on 7 tasks across 8 benchmarks.
Problem

Research questions and friction points this paper is trying to address.

Leveraging diffusion models for zero-shot multi-task learning
Training multi-task models on partially labeled synthetic datasets
Enhancing inter-task synergy with task-attention mechanisms
Innovation

Methods, ideas, or system contributions that make the work stand out.

Repurposing latent diffusion models for regression
Unified latent loss replaces per-task balancing
Task-attention mechanism enhances cross-task sharing
🔎 Similar Papers