🤖 AI Summary
Multi-class classification on resource-constrained mobile devices must balance accuracy, on-device computation, and communication overhead. Method: This paper proposes a client-server collaborative hybrid inference framework. It extends the Learning to Help (L2H) paradigm from binary to multi-class settings; designs a differentiable, convex, and Bayes-consistent stage-switching surrogate loss to overcome the non-optimizability of the 0–1 loss; and introduces a dynamic offloading mechanism based on learning with rejection, where a client-side rejector adaptively decides whether to forward each sample to the server. Contribution/Results: Theoretically, the optimal decision rule is derived via Bayesian risk minimization. Experiments demonstrate that the method significantly outperforms existing Learning to Defer (L2D) approaches and single-model baselines under realistic constraints, such as high communication cost and low server availability, achieving state-of-the-art trade-offs between accuracy and computational/communication overhead.
📝 Abstract
Deploying complex machine learning models on resource-constrained devices is challenging due to limited computational power, memory, and model retrainability. To address these limitations, a hybrid system can be established by augmenting the local model with a server-side model, where samples are selectively deferred by a rejector and then sent to the server for processing. The hybrid system enables efficient use of computational resources while minimizing the overhead associated with server usage. The recently proposed Learning to Help (L2H) model trains a server model given a fixed local (client) model, differing from the Learning to Defer (L2D) framework, which trains the client for a fixed (expert) server. In both L2D and L2H, the training includes learning a rejector at the client to determine when to query the server. In this work, we extend the L2H model from binary to multi-class classification problems and demonstrate its applicability in a number of different scenarios of practical interest in which access to the server may be limited by cost, availability, or policy. We derive a stage-switching surrogate loss function that is differentiable, convex, and consistent with the Bayes rule corresponding to the 0-1 loss for the L2H model. Experiments show that our proposed methods offer an efficient and practical solution for multi-class classification in resource-constrained environments.
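The routing step described above, in which a client-side rejector decides per sample whether to answer locally or query the server, can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names (`hybrid_predict`, `client_model`, `rejector`, `server_model`), the margin-based rejection rule, and the `server_available` flag are all hypothetical stand-ins chosen for illustration. In the paper the rejector and the server model are learned with the stage-switching surrogate loss, which is not reproduced here.

```python
from typing import Callable, Sequence, Tuple

Scores = Sequence[float]  # per-class scores for one sample (assumed input format)


def client_model(x: Scores) -> int:
    """Hypothetical fixed local model: predict the class with the highest score."""
    return max(range(len(x)), key=lambda k: x[k])


def rejector(x: Scores, margin: float = 0.2) -> bool:
    """Hypothetical rejector: defer when the top-two score gap is below `margin`.

    A learned rejector would replace this hand-written rule.
    """
    top, second = sorted(x, reverse=True)[:2]
    return (top - second) < margin


def server_model(x: Scores) -> int:
    """Hypothetical stand-in for the server-side model (always answers class 2)."""
    return 2


def hybrid_predict(
    x: Scores,
    client: Callable[[Scores], int],
    reject: Callable[[Scores], bool],
    server: Callable[[Scores], int],
    server_available: bool = True,
) -> Tuple[int, str]:
    """Route one sample and report which side produced the label."""
    # Query the server only if the rejector defers AND the server can be reached;
    # otherwise fall back to the local prediction.
    if reject(x) and server_available:
        return server(x), "server"
    return client(x), "client"


confident = [0.7, 0.2, 0.1]    # large top-two gap, handled locally
ambiguous = [0.4, 0.35, 0.25]  # small top-two gap, deferred when possible

local_result = hybrid_predict(confident, client_model, rejector, server_model)
deferred_result = hybrid_predict(ambiguous, client_model, rejector, server_model)
offline_result = hybrid_predict(
    ambiguous, client_model, rejector, server_model, server_available=False
)
```

The `server_available` flag is one simple way to mimic the limited-access scenarios mentioned in the abstract (cost, availability, or policy): when the server cannot be used, the sample falls back to the client model even if the rejector would have deferred it.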