Dual-Balancing for Multi-Task Learning

📅 2023-08-23

🏛️ Neural Networks

📈 Citations: 20

✨ Influential: 2

🤖 AI Summary

In multi-task learning (MTL), performance degrades when tasks differ in loss scale and gradient magnitude, which makes task balancing a fundamental challenge. To address this, we propose a dual-balancing method that works from two perspectives: first, a logarithm transformation of each task loss balances loss scales; second, task gradients are normalized to comparable magnitudes before they are combined. The method balances loss scale and gradient magnitude within a single framework and is compatible with any backpropagation-based model. Evaluated on standard benchmarks including MTL-Bench, it achieves an average accuracy improvement of 3.2% over strong baselines. Moreover, it significantly mitigates overfitting on dominant tasks while enhancing generalization for low-resource tasks.

Problem

Research questions and friction points this paper is trying to address.

Addresses the performance loss caused by disparities in loss scale and gradient magnitude across tasks in multi-task learning

Proposes a dual-balancing method that normalizes both loss scales and gradient magnitudes

Tackles task balancing through a logarithmic transformation of the losses and a rescaling of the gradients
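One way to see why a logarithmic transformation balances loss scales is the short derivation below. The notation is an assumption made for illustration and does not come from the summary: \ell_k(\theta) > 0 is the loss of task k, \theta the shared parameters, and c > 0 an arbitrary scale factor.

```latex
\nabla_\theta \log \ell_k(\theta)
  = \frac{\nabla_\theta \ell_k(\theta)}{\ell_k(\theta)},
\qquad
\nabla_\theta \log\bigl(c\,\ell_k(\theta)\bigr)
  = \frac{c\,\nabla_\theta \ell_k(\theta)}{c\,\ell_k(\theta)}
  = \nabla_\theta \log \ell_k(\theta).
```

Multiplying a task loss by a constant therefore leaves the gradient of its logarithm unchanged, so a task measured in large units no longer dominates purely because of its scale.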

Innovation

Methods, ideas, or system contributions that make the work stand out.

Applying a logarithm transformation to balance loss scales

Normalizing gradients to comparable magnitudes

Balancing tasks from both the loss and the gradient perspective
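The two ideas listed above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `dual_balance`, the choice of the largest task-gradient norm as the common target, and the toy numbers are all hypothetical, and per-task gradients with respect to the shared parameters are assumed to be already available.

```python
import numpy as np

def dual_balance(losses, grads, eps=1e-8):
    """Combine per-task gradients after loss-scale and gradient-magnitude balancing.

    losses: positive per-task loss values, shape (num_tasks,)
    grads:  gradients of the raw losses w.r.t. shared parameters,
            shape (num_tasks, num_params)
    """
    losses = np.asarray(losses, dtype=float)
    grads = np.asarray(grads, dtype=float)

    # Loss-scale balancing: the gradient of log(loss_k) is grad_k / loss_k.
    log_grads = grads / (losses[:, None] + eps)

    # Gradient-magnitude balancing: rescale every task gradient to a common norm.
    # Using the largest norm as the target is an assumption made for this sketch.
    norms = np.linalg.norm(log_grads, axis=1)
    target = norms.max()
    balanced = log_grads * (target / (norms[:, None] + eps))

    # The shared update direction is the sum of the balanced task gradients.
    return balanced.sum(axis=0), balanced

# Toy example: two tasks whose losses differ by four orders of magnitude.
losses = [100.0, 0.01]
grads = [[300.0, 400.0], [0.0, 0.005]]
update, balanced = dual_balance(losses, grads)
print(np.linalg.norm(balanced, axis=1))  # per-task norms after balancing
print(update)                            # combined update direction
```

In the toy example the raw gradient norms differ by a factor of 100,000, yet after both steps each task contributes a gradient of the same norm to the shared update.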

🔎 Similar Papers

A Parameter Update Balancing Algorithm for Multi-task Ranking Models in Recommendation Systems