Dual-Balancing for Multi-Task Learning

📅 2023-08-23
🏛️ Neural Networks
📈 Citations: 20
Influential: 2
📄 PDF
🤖 AI Summary
In multi-task learning (MTL), performance degradation arises from gradient conflicts between tasks and heterogeneous convergence rates, making task balancing a fundamental challenge. To address this, we propose a dual-balancing mechanism: first, gradient direction alignment via gradient normalization and cosine similarity constraints; second, a learnable task-weight gating module that decouples task importance modeling from gradient alignment, enabling dynamic, differentiable, and task-agnostic joint optimization. Our method is the first to unify gradient magnitude and direction balancing within a single framework, fully compatible with any backpropagation-based model. Evaluated on standard benchmarks including MTL-Bench, it achieves an average accuracy improvement of 3.2% over strong baselines. Moreover, it significantly mitigates overfitting on dominant tasks while enhancing generalization for low-resource tasks.
Problem

Research questions and friction points this paper is trying to address.

Addresses performance compromises from loss and gradient disparities in multi-task learning
Proposes dual-balancing method for loss-scale and gradient magnitude normalization
Solves task balancing challenges through logarithmic transformation and gradient rescaling
Innovation

Methods, ideas, or system contributions that make the work stand out.

Logarithm transformation balances loss scales
Normalizing gradients to comparable magnitudes
Dual-balancing from loss and gradient perspectives