🤖 AI Summary
This work addresses gradient conflict in multi-task learning, which degrades shared representations and leads to “latent representation collapse,” limiting per-task performance. To mitigate this, the authors propose an orthogonal pooling mechanism that restructures the latent space by assigning each task a mutually orthogonal subspace, keeping the learned representations disentangled. The resulting subspaces are interpretable and composable, alleviating gradient interference and permitting direct manipulation of semantic concepts. Experiments on benchmark datasets, including ShapeNet, MPIIGaze, and Rotated MNIST, demonstrate that the proposed method significantly improves multi-task performance while yielding latent representations with clear structure and well-separated semantics.
📝 Abstract
Training a single network with multiple objectives often leads to conflicting gradients that degrade shared representations, forcing them into a compromised state that is suboptimal for any single task, a problem we term latent representation collapse. We introduce Domain Expansion, a framework that prevents these conflicts by restructuring the latent space itself. Our framework uses a novel orthogonal pooling mechanism to construct a latent space where each objective is assigned to a mutually orthogonal subspace. We validate our approach across diverse benchmarks, including ShapeNet, MPIIGaze, and Rotated MNIST, on challenging multi-objective problems combining classification with pose and gaze estimation. Our experiments demonstrate that this structure not only prevents collapse but also yields an explicit, interpretable, and compositional latent space where concepts can be directly manipulated.
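The core idea of assigning each objective its own mutually orthogonal subspace can be illustrated with a minimal sketch. This is not the paper's actual orthogonal pooling implementation; the dimensions, the block-of-a-QR-basis construction, and the `project` helper are all assumptions made for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim, n_tasks, sub_dim = 12, 3, 4  # assumed sizes: 3 tasks * 4 dims = 12

# One simple way to obtain mutually orthogonal subspaces: take disjoint
# column blocks of a random orthonormal basis (QR decomposition).
Q, _ = np.linalg.qr(rng.standard_normal((latent_dim, latent_dim)))
bases = [Q[:, t * sub_dim:(t + 1) * sub_dim] for t in range(n_tasks)]

def project(z, B):
    """Project latent vector z onto the subspace spanned by B's columns."""
    return B @ (B.T @ z)

z = rng.standard_normal(latent_dim)          # a shared latent vector
parts = [project(z, B) for B in bases]       # one component per task

# Because the subspaces are orthogonal and jointly span the latent space,
# the per-task components are mutually orthogonal and sum back to z,
# so an update confined to one block cannot interfere with another task's.
print(np.allclose(sum(parts), z))
print(abs(parts[0] @ parts[1]) < 1e-10)
```

In this toy setting, gradient interference is avoided by construction: each task reads from and writes to only its own block of the basis, which is the structural property the abstract attributes to orthogonal pooling.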