🤖 AI Summary
Auxiliary tasks in multi-task learning typically rely on costly, manually designed objectives or expensive meta-learning. Method: We propose Detaux, a framework that leverages weakly supervised representation disentanglement to automatically discover an auxiliary classification task that is unrelated to the principal task yet exhibits high class separability, enabling a seamless transition from single-task to multi-task learning. It constructs the auxiliary task via orthogonal subspace decomposition and clustering of projections in the most disentangled subspace, requiring neither human annotations nor meta-learning, and then feeds the original data, principal-task labels, and discovered labels into any MTL framework. Contribution/Results: Detaux reveals a previously unexplored connection between disentangled representation learning and multi-task learning. Experiments on synthetic and real-world benchmarks, together with ablation studies, demonstrate promising results, particularly when data are scarce or the principal task is complex.
📝 Abstract
Auxiliary tasks facilitate learning in situations where data are scarce or the principal task of focus is extremely complex. This idea is primarily inspired by the improved generalization capability induced by solving multiple tasks simultaneously, which leads to a more robust shared representation. Nevertheless, finding optimal auxiliary tasks is a crucial problem that often requires hand-crafted solutions or expensive meta-learning approaches. In this paper, we propose a novel framework, dubbed Detaux, in which a weakly supervised disentanglement procedure is used to discover a new, unrelated auxiliary classification task, allowing us to go from a Single-Task Learning (STL) to a Multi-Task Learning (MTL) problem. The disentanglement procedure works at the representation level, isolating the variation related to the principal task in a dedicated subspace and additionally producing an arbitrary number of orthogonal subspaces, each encouraging high separability among the projections. We generate the auxiliary classification task through a clustering procedure on the most disentangled subspace, obtaining a discrete set of labels. Subsequently, the original data, the labels associated with the principal task, and the newly discovered ones can be fed into any MTL framework. Experimental validation on both synthetic and real data, along with various ablation studies, demonstrates promising results, revealing the potential of what has been, so far, an unexplored connection between learning disentangled representations and MTL. The source code will be made available upon acceptance.
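The label-discovery step described above can be sketched in a few lines: project each representation onto a learned orthogonal subspace, then cluster the projections so that each cluster index becomes a discrete auxiliary label. This is a minimal illustrative sketch, not the paper's implementation; the function name `auxiliary_labels_from_subspace`, the use of k-means as the clustering procedure, and the toy data are all assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def auxiliary_labels_from_subspace(Z, basis, n_clusters=4, seed=0):
    """Illustrative sketch (not the paper's code): project representations
    Z (n_samples, d) onto an orthogonal subspace spanned by the orthonormal
    columns of `basis` (d, k), then cluster the projections to obtain one
    discrete auxiliary label per sample."""
    P = Z @ basis  # project each representation into the subspace
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return km.fit_predict(P)  # cluster indices serve as auxiliary labels

# Toy demo: two well-separated groups living in a 2-D subspace of a 5-D space.
rng = np.random.default_rng(0)
basis, _ = np.linalg.qr(rng.normal(size=(5, 2)))  # orthonormal 2-D basis
blob_a = rng.normal(loc=0.0, scale=0.1, size=(50, 2))
blob_b = rng.normal(loc=5.0, scale=0.1, size=(50, 2))
Z = np.vstack([blob_a, blob_b]) @ basis.T  # embed back into ambient space
labels = auxiliary_labels_from_subspace(Z, basis, n_clusters=2)
```

The resulting `labels` array can then be paired with the principal-task labels to train any standard MTL model on the same inputs.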