🤖 AI Summary
In multi-task learning, gradient conflicts among tasks often cause optimization imbalance and performance degradation. This paper proposes ConicGrad, the first method to introduce dynamic conical constraints in angular space: it restricts each task’s gradient update direction within a cone centered on the overall objective gradient, enabling inter-task coordination while preserving parameter expressivity. ConicGrad integrates reference-gradient guidance, cosine-angle-constrained projection, a differentiable conical projection operator, and adaptive task weight estimation. It provides theoretical convergence guarantees and naturally supports high-dimensional parameter spaces and heterogeneous task combinations. Evaluated on multi-task supervised and reinforcement learning benchmarks, ConicGrad consistently outperforms baselines—including MOO, PCGrad, and GradNorm—achieving an average 3.2% performance gain and 37% improvement in training stability. Moreover, it scales linearly to over 100 tasks.
📝 Abstract
Balancing competing objectives remains a fundamental challenge in multi-task learning (MTL), primarily due to conflicting gradients across individual tasks. A common solution relies on computing a dynamic gradient update vector that balances competing tasks as optimization progresses. Building on this idea, we propose ConicGrad, a principled, scalable, and robust MTL approach formulated as a constrained optimization problem. Our method introduces an angular constraint to dynamically regulate gradient update directions, confining them within a cone centered on the reference gradient of the overall objective. By balancing task-specific gradients without over-constraining their direction or magnitude, ConicGrad effectively resolves inter-task gradient conflicts. Moreover, our framework ensures computational efficiency and scalability to high-dimensional parameter spaces. We conduct extensive experiments on standard supervised learning and reinforcement learning MTL benchmarks, and demonstrate that ConicGrad achieves state-of-the-art performance across diverse tasks.