🤖 AI Summary
In multi-task learning (MTL), balancing positive and negative transfer across tasks remains challenging, limiting generalization performance and representation sharing efficiency. Method: We propose the first theoretically grounded, assumption-light task grouping framework—formulated as a mathematical program that jointly models task similarity, differentiable transfer gain estimation, and theory-guided clustering optimization under resource constraints. Unlike heuristic approaches, it enables flexible modeling and end-to-end training. Contribution/Results: Evaluated across diverse benchmarks—including vision, time-series, and combinatorial optimization domains—our method consistently outperforms state-of-the-art MTL baselines in both accuracy and computational efficiency. It provides a principled, interpretable, and verifiable paradigm for transfer regulation in MTL, advancing theoretical rigor while maintaining practical scalability.
📝 Abstract
Multi-task learning (MTL) aims to leverage shared information among tasks to improve learning efficiency and accuracy. However, MTL often struggles to effectively manage positive and negative transfer between tasks, which can hinder performance improvements. Task grouping addresses this challenge by organizing tasks into meaningful clusters, maximizing beneficial transfer while minimizing detrimental interactions. This paper introduces a principled approach to task grouping in MTL, advancing beyond existing methods by addressing key theoretical and practical limitations. Unlike prior studies, our method offers a theoretically grounded approach that does not depend on restrictive assumptions for constructing transfer gains. We also present a flexible mathematical programming formulation that accommodates a wide range of resource constraints, thereby enhancing its versatility. Experimental results across diverse domains, including computer vision datasets, combinatorial optimization benchmarks, and time series tasks, demonstrate the superiority of our method over extensive baselines, thereby validating its effectiveness and general applicability in MTL without sacrificing efficiency.