🤖 AI Summary
To address the high cost of manually annotating auxiliary tasks in auxiliary learning, and the computational expense of existing meta-learning approaches that rely on bilevel optimization, this paper proposes a reinforcement learning (RL)-based framework for dynamic auxiliary task generation. The method jointly models auxiliary label selection and per-sample loss weighting as a sequential decision-making problem in which an RL agent generates auxiliary supervision signals for each training sample, eliminating the need for bilevel optimization. On 20-superclass CIFAR-100, the Weight-Aware variant reaches 80.9% test accuracy with VGG16, outperforming human-labeled auxiliary tasks (75.53%), while the plain RL approach matches a prominent bilevel optimization technique. The core contribution is end-to-end, sample-wise dynamic generation of auxiliary task labels and loss weights, trading the complexity of bilevel optimization for a simpler and more scalable RL formulation.
📝 Abstract
Auxiliary Learning (AL) is a special case of Multi-task Learning (MTL) in which a network trains on auxiliary tasks to improve generalization and, ultimately, performance on its main task. AL has been shown to improve performance across multiple domains, including navigation, image classification, and natural language processing. One weakness of AL is the need for labeled auxiliary tasks, which can require human effort and domain expertise to generate. Meta Learning techniques address this issue by learning an additional auxiliary task generation network that creates helpful tasks for the primary network. The most prominent techniques rely on Bi-Level Optimization, which incurs computational cost and increased code complexity. To avoid the need for Bi-Level Optimization, we present an RL-based approach to dynamically create auxiliary tasks. In this framework, an RL agent selects auxiliary labels for every data point in a training set and is rewarded when its selection improves performance on the primary task. We also experiment with learning optimal strategies for weighting the auxiliary loss per data point. On the 20-Superclass CIFAR-100 problem, our RL approach outperforms human-labeled auxiliary tasks and performs as well as a prominent Bi-Level Optimization technique. Our weight-learning approaches significantly outperform all of these benchmarks: for example, a Weight-Aware RL-based approach helps the VGG16 architecture achieve 80.9% test accuracy, while the human-labeled auxiliary task setup achieved 75.53%. The goal of this work is to (1) demonstrate that RL is a viable approach to dynamically generating auxiliary tasks and (2) show that per-sample auxiliary task weights can be learned alongside the auxiliary task labels and achieve strong results.
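The loop the abstract describes (an agent picks an auxiliary label for each sample, is rewarded by the resulting main-task improvement, and a per-sample auxiliary loss weight is adapted alongside it) can be illustrated with a toy sketch. Everything below — the sample and label counts, the epsilon-greedy bandit-style agent, and the synthetic reward function — is a hypothetical stand-in for the paper's actual RL setup and training signal, not its implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_SAMPLES = 8      # toy training-set size (hypothetical)
NUM_AUX_LABELS = 4   # auxiliary label choices per sample (hypothetical)

# Q[i, a] estimates how helpful auxiliary label `a` has been for sample `i`
# (a simple per-sample bandit stand-in for the paper's RL agent).
Q = np.zeros((NUM_SAMPLES, NUM_AUX_LABELS))
counts = np.zeros((NUM_SAMPLES, NUM_AUX_LABELS))

# Per-sample auxiliary loss weights, adapted from the same reward signal
# (a crude stand-in for the Weight-Aware variant).
weights = np.full(NUM_SAMPLES, 0.5)

def fake_primary_improvement(sample, aux_label, weight):
    """Stand-in for 'train one step, measure main-task improvement'.
    Here each sample secretly prefers one auxiliary label."""
    preferred = sample % NUM_AUX_LABELS
    base = 1.0 if aux_label == preferred else -0.2
    return base * weight + rng.normal(scale=0.05)

EPSILON, LR = 0.1, 0.1
for _ in range(500):
    i = int(rng.integers(NUM_SAMPLES))
    # Epsilon-greedy auxiliary-label selection for sample i.
    if rng.random() < EPSILON:
        a = int(rng.integers(NUM_AUX_LABELS))
    else:
        a = int(np.argmax(Q[i]))
    reward = fake_primary_improvement(i, a, weights[i])
    counts[i, a] += 1
    Q[i, a] += (reward - Q[i, a]) / counts[i, a]  # incremental mean update
    # Nudge the per-sample auxiliary weight with the same reward.
    weights[i] = np.clip(weights[i] + LR * reward, 0.0, 1.0)

chosen = Q.argmax(axis=1)
print(chosen)            # learned per-sample auxiliary labels
print(weights.round(2))  # learned per-sample auxiliary weights
```

The point of the sketch is the joint structure: the same scalar reward drives both the discrete label choice and the continuous loss weight, so no inner optimization loop (and hence no bilevel setup) is needed.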