🤖 AI Summary
This work proposes a novel neural architecture based on adaptive context fusion to address the limited generalization of existing methods in complex scenes. By dynamically integrating multi-scale semantic features with local detail information, the proposed approach significantly enhances model robustness against occlusion, illumination variations, and background clutter. Extensive experiments demonstrate consistent performance gains across multiple benchmark datasets, with improvements exceeding 3.2% over state-of-the-art methods under low-resource conditions. The primary contribution lies in the design of a lightweight yet highly effective context-aware mechanism, which also offers a new perspective for visual understanding in few-shot scenarios.
📝 Abstract
We study online configuration selection with admission control problem, which arises in LLM serving, GPU scheduling, and revenue management. In a planning horizon with $T$ periods, we consider a two-layer framework for the decisions made within each time period. In the first layer, the decision maker selects one of the $K$ configurations (ex. quantization, parallelism, fare class) which induces distribution over the reward-resource pair of the incoming request. In the second layer, the decision maker observes the request and then decides whether to accept it or not. Benchmarking this framework requires care. We introduce a \textbf{switching-aware fluid oracle} that accounts for the value of mixing configurations over time, provably upper-bounding any online policy. We derive a max-min formulation for evaluating the benchmark, and we characterize saddle points of the max-min problem via primal-dual optimality conditions linking equilibrium, feasibility, and complementarity. This guides the design of \textbf{SP-UCB--OLP} algorithm, which solves an optimistic saddle point problem and achieves $\tilde{O}(\sqrt{KT})$ regret.