Prime Once, then Reprogram Locally: An Efficient Alternative to Black-Box Service Model Adaptation

📅 2026-04-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the high cost and inefficiency of existing zeroth-order optimization (ZOO)-based black-box model adaptation methods, which rely on numerous API calls and degrade markedly on modern large models such as GPT-4o, whose outputs are relatively insensitive to the input perturbations ZOO depends on. To overcome these limitations, the authors propose AReS, a novel “one-shot excitation with local reprogramming” paradigm: a single black-box API query primes a lightweight layer on top of a local pre-trained encoder, after which white-box reprogramming and fine-tuning proceed entirely offline, with no further API interactions. This approach reduces API overhead by over 99.99% while consistently outperforming state-of-the-art methods (average gains of +2.5% on vision-language tasks and +15.6% on standard vision benchmarks) and yields a +27.8% improvement over zero-shot GPT-4o performance, substantially improving both adaptation efficiency and stability.
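The summary's cost argument is concrete: a two-point zeroth-order gradient estimate spends two API calls per random direction, on every optimization step. A minimal sketch of such an estimator, where `f` stands in for the black-box API (names and constants are illustrative, not from the paper):

```python
import numpy as np

def zoo_gradient(f, x, mu=1e-3, n_dirs=2000, rng=None):
    """Two-point zeroth-order gradient estimate of a black box f at x.

    Each random direction costs 2 calls to f, so one gradient step
    needs 2 * n_dirs API queries -- the overhead AReS avoids.
    """
    rng = rng or np.random.default_rng(0)
    grad = np.zeros_like(x)
    for _ in range(n_dirs):
        u = rng.standard_normal(x.shape)
        grad += (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u
    return grad / n_dirs

# Toy black box: f(x) = ||x||^2, whose true gradient is 2x.
f = lambda x: float(np.sum(x ** 2))
x = np.ones(4)
g = zoo_gradient(f, x)   # ~4000 "API calls" for one noisy gradient
```

When `f` is nearly flat under small input perturbations, as the summary reports for models like GPT-4o, the finite differences above return mostly noise; that is the second failure mode AReS sidesteps.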
📝 Abstract
Adapting closed-box service models (i.e., APIs) for target tasks typically relies on reprogramming via Zeroth-Order Optimization (ZOO). However, this standard strategy is known for extensive, costly API calls and often suffers from slow, unstable optimization. Furthermore, we observe that this paradigm faces new challenges with modern APIs (e.g., GPT-4o). These models can be less sensitive to the input perturbations ZOO relies on, thereby hindering performance gains. To address these limitations, we propose an Alternative efficient Reprogramming approach for Service models (AReS). Instead of direct, continuous closed-box optimization, AReS initiates a single-pass interaction with the service API to prime an amenable local pre-trained encoder. This priming stage trains only a lightweight layer on top of the local encoder, making it highly receptive to the subsequent glass-box (white-box) reprogramming stage performed directly on the local model. Consequently, all subsequent adaptation and inference rely solely on this local proxy, eliminating all further API costs. Experiments demonstrate AReS's effectiveness where prior ZOO-based methods struggle: on GPT-4o, AReS achieves a +27.8% gain over the zero-shot baseline, a task where ZOO-based methods provide little to no improvement. Broadly, across ten diverse datasets, AReS outperforms state-of-the-art methods (+2.5% for VLMs, +15.6% for standard VMs) while reducing API calls by over 99.99%. AReS thus provides a robust and practical solution for adapting modern closed-box models.
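The abstract specifies the control flow (one API pass to prime a lightweight layer on a frozen local encoder, then fully offline glass-box reprogramming) but not the concrete objectives, so the following is only a structural sketch with invented stand-ins: a linear "service API", a random-projection encoder, a least-squares head, and an additive input program trained with exact gradients. The call counter makes the key property explicit: exactly one black-box query.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, n = 8, 16, 256
X = rng.standard_normal((n, d))            # local adaptation data

calls = {"n": 0}
w_true = rng.standard_normal(d)
def service_api(inp):                      # stand-in for the closed-box model
    calls["n"] += 1
    return inp @ w_true

W_enc = rng.standard_normal((d, k)) / np.sqrt(d)   # frozen local encoder
encode = lambda inp: inp @ W_enc

# Stage 1: one-shot excitation. A single API pass supplies targets for a
# lightweight head on the frozen encoder (closed-form least squares here).
y_api = service_api(X)                     # the ONLY black-box interaction
head, *_ = np.linalg.lstsq(encode(X), y_api, rcond=None)
proxy = lambda inp: encode(inp) @ head     # local glass-box proxy

# Stage 2: white-box reprogramming, fully offline. Learn an additive
# input program `delta` by exact gradient descent through the proxy.
y_task = np.sin(X @ w_true)                # hypothetical target-task labels
delta, lr = np.zeros(d), 0.01
v = W_enc @ head                           # proxy is linear in delta
for _ in range(300):
    residual = proxy(X + delta) - y_task
    delta -= lr * 2.0 * v * residual.mean()   # exact, zero-API gradient step
```

The division of labor mirrors the abstract's claim: the service model is touched once to make the local encoder "amenable," and all subsequent adaptation and inference run on the local proxy at zero API cost.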
Problem

Research questions and friction points this paper is trying to address.

closed-box service models
Zeroth-Order Optimization
API adaptation
model reprogramming
input perturbation sensitivity
Innovation

Methods, ideas, or system contributions that make the work stand out.

black-box model adaptation
zeroth-order optimization
local proxy model
efficient reprogramming
API cost reduction