🤖 AI Summary
This paper addresses intelligent rejection (i.e., abstaining from prediction when uncertain) in tightly coupled multi-task learning scenarios such as object detection and electronic health record (EHR) analysis, where classification and regression tasks are strongly interdependent. We propose the first two-stage learning framework for rejection in such settings. Methodologically, we design a novel family of surrogate losses that are both (𝒢,ℛ)-consistent and Bayes-consistent, and we introduce a rejection mechanism jointly driven by cross-entropy and L1 costs, with a theoretically grounded consistency bound that accommodates heterogeneous expert-dependent costs. Theoretical analysis and experiments on object detection and EHR tasks demonstrate significant improvements over state-of-the-art methods, validating the framework’s effectiveness in strongly coupled multi-task settings and revealing the systemic suboptimality of existing single-stage rejection approaches.
📝 Abstract
The Two-Stage Learning-to-Defer framework has been extensively studied for classification and, more recently, regression tasks. However, many contemporary applications involve both classification and regression in an interdependent manner. In this work, we introduce a novel Two-Stage Learning-to-Defer framework for multi-task learning that jointly addresses these tasks. Our approach leverages a two-stage surrogate loss family, which we prove to be both ($\mathcal{G}, \mathcal{R}$)-consistent and Bayes-consistent, providing strong theoretical guarantees of convergence to the Bayes-optimal rejector. We establish consistency bounds explicitly linked to the cross-entropy surrogate family and the $L_1$-norm of the agents' costs, extending the theoretical minimizability gap analysis to the two-stage setting with multiple experts. We validate our framework on two challenging tasks: object detection, where classification and regression are tightly coupled, and existing methods fail, and electronic health record analysis, in which we highlight the suboptimality of current learning-to-defer approaches.
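To make the deferral idea concrete, the following is a minimal illustrative sketch, not the paper's method. It assumes a hypothetical per-agent cost made of a 0-1 classification error, an L1 regression error, and a fixed consultation fee, and it uses an oracle rule that picks the cheapest agent when the true labels are known. The function names `agent_cost` and `oracle_defer`, the cost decomposition, and the fee values are all assumptions introduced here; the paper's rejector is instead learned through a surrogate loss.

```python
import numpy as np


def agent_cost(pred_class, pred_reg, true_class, true_reg, consult_cost=0.0):
    """Hypothetical cost of one agent on one example (an assumption, not
    the paper's definition): 0-1 classification error, plus L1 regression
    error, plus a fixed consultation fee."""
    cls_err = float(pred_class != true_class)
    diff = np.asarray(pred_reg, dtype=float) - np.asarray(true_reg, dtype=float)
    reg_err = float(np.abs(diff).sum())
    return cls_err + reg_err + consult_cost


def oracle_defer(costs):
    """Index of the lowest-cost agent (0 = main model, 1.. = experts).
    This oracle needs the true costs; a learned rejector would have to
    approximate this choice from the input alone."""
    return int(np.argmin(np.asarray(costs, dtype=float)))


# Toy detection-style example: a class label plus a 4-number bounding box.
true_class, true_box = 2, [0.0, 0.0, 1.0, 1.0]

# Case A: the model has the right class and a slightly-off box;
# the expert is exact but charges a fee of 0.5.
costs_a = [
    agent_cost(2, [0.1, 0.0, 1.0, 1.2], true_class, true_box),
    agent_cost(2, [0.0, 0.0, 1.0, 1.0], true_class, true_box, consult_cost=0.5),
]

# Case B: same box error, but the model also gets the class wrong.
costs_b = [
    agent_cost(1, [0.1, 0.0, 1.0, 1.2], true_class, true_box),
    agent_cost(2, [0.0, 0.0, 1.0, 1.0], true_class, true_box, consult_cost=0.5),
]

choice_a = oracle_defer(costs_a)  # model is cheaper, so no deferral
choice_b = oracle_defer(costs_b)  # expert is cheaper, so defer
```

In this toy setup the model is kept in case A and the expert is consulted in case B, which shows why the classification and regression errors have to be weighed together in a single deferral decision rather than handled by separate per-task rejectors.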