Selective Task Group Updates for Multi-Task Optimization

πŸ“… 2025-02-17
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
In multi-task learning, parameter sharing often induces negative transfer; existing gradient- or loss-weighting methods focus on balancing shared layers but neglect task-specific parameter optimization. This paper proposes a β€œbatch-wise dynamic task grouping and selective update” paradigm that preserves shared representations while enhancing task-specific learning. Our key contributions are: (1) the first proximal inter-task affinity metric, computed in real time during optimization to quantify task relatedness; (2) theoretical proof that sequential grouped updates significantly improve convergence of task-specific parameters; and (3) an adaptive grouping algorithm coupled with a group-wise gradient update mechanism. Extensive experiments demonstrate substantial improvements over state-of-the-art methods across multiple benchmarks. The approach exhibits strong generalization across diverse architectures and task scales, as well as excellent scalability to large numbers of tasks.

Technology Category

Application Category

πŸ“ Abstract
Multi-task learning enables the acquisition of task-generic knowledge by training multiple tasks within a unified architecture. However, training all tasks together in a single architecture can lead to performance degradation, known as negative transfer, which is a main concern in multi-task learning. Previous works have addressed this issue by optimizing the multi-task network through gradient manipulation or weighted loss adjustments. However, their optimization strategy focuses on addressing task imbalance in shared parameters, neglecting the learning of task-specific parameters. As a result, they show limitations in mitigating negative transfer, since the learning of shared space and task-specific information influences each other during optimization. To address this, we propose a different approach to enhance multi-task performance by selectively grouping tasks and updating them for each batch during optimization. We introduce an algorithm that adaptively determines how to effectively group tasks and update them during the learning process. To track inter-task relations and optimize multi-task networks simultaneously, we propose proximal inter-task affinity, which can be measured during the optimization process. We provide a theoretical analysis on how dividing tasks into multiple groups and updating them sequentially significantly affects multi-task performance by enhancing the learning of task-specific parameters. Our methods substantially outperform previous multi-task optimization approaches and are scalable to different architectures and various numbers of tasks.
Problem

Research questions and friction points this paper is trying to address.

Mitigates negative transfer in multi-task learning
Enhances task-specific parameter learning
Adaptively groups and updates tasks during optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Selective task grouping
Proximal inter-task affinity
Sequential task updates
πŸ”Ž Similar Papers
No similar papers found.