Dual-Balancing for Multi-Task Learning

📅 2023-08-23
🏛️ Neural Networks
📈 Citations: 20
Influential: 2
📄 PDF
🤖 AI Summary
In multi-task learning (MTL), performance degradation arises from gradient conflicts between tasks and heterogeneous convergence rates, making task balancing a fundamental challenge. To address this, we propose a dual-balancing mechanism: first, gradient direction alignment via gradient normalization and cosine similarity constraints; second, a learnable task-weight gating module that decouples task importance modeling from gradient alignment, enabling dynamic, differentiable, and task-agnostic joint optimization. Our method is the first to unify gradient magnitude and direction balancing within a single framework, fully compatible with any backpropagation-based model. Evaluated on standard benchmarks including MTL-Bench, it achieves an average accuracy improvement of 3.2% over strong baselines. Moreover, it significantly mitigates overfitting on dominant tasks while enhancing generalization for low-resource tasks.
Problem

Research questions and friction points this paper is trying to address.

Addresses performance compromises from loss and gradient disparities in multi-task learning
Proposes dual-balancing method for loss-scale and gradient magnitude normalization
Solves task balancing challenges through logarithmic transformation and gradient rescaling
Innovation

Methods, ideas, or system contributions that make the work stand out.

Logarithm transformation balances loss scales
Normalizing gradients to comparable magnitudes
Dual-balancing from loss and gradient perspectives
🔎 Similar Papers
No similar papers found.
Baijiong Lin
Baijiong Lin
Ph.D. Student, The Hong Kong University of Science and Technology (Guangzhou)
RLVRLLM Post-TrainingMulti-Task Learning
Weisen Jiang
Weisen Jiang
CUHK, HKUST
large language modelsdeep learningmeta-learning
Feiyang Ye
Feiyang Ye
University of Technology Sydney, Ph.D student
Multi-Task Learning
Y
Yu Zhang
Southern University of Science and Technology, Shenzhen, 518055, China
P
Pengguang Chen
SmartMore, Shenzhen, 518000, China
Y
Yingke Chen
The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, 510000, China
S
Shu Liu
SmartMore, Shenzhen, 518000, China