Learning a Zeroth-Order Optimizer for Fine-Tuning LLMs

📅 2025-09-30

🤖 AI Summary
Existing zeroth-order (ZO) optimizers rely on hand-crafted, static perturbation strategies that adapt poorly to large language model (LLM) architectures, resulting in high memory overhead and limited generalization during fine-tuning. To address this, the paper proposes ZO Fine-tuner, a learnable ZO framework that uses meta-learning to automatically discover task-aware perturbation directions via a lightweight neural network, replacing fixed sampling schemes. The framework is trained once per LLM and then reused across diverse tasks without retraining, supports mainstream LLM architectures, and incurs minimal deployment overhead. Experiments across four LLMs and seven benchmark datasets show that ZO Fine-tuner outperforms existing ZO optimizers on 82.1% of task-model combinations, improving the adaptability, efficiency, and scalability of zeroth-order fine-tuning while preserving gradient-free operation.

📝 Abstract
Zeroth-order optimizers have recently emerged as a practical approach for fine-tuning large language models (LLMs), significantly reducing GPU memory consumption compared to traditional first-order methods. Yet, existing zeroth-order methods rely on hand-crafted, static sampling strategies that are not adaptable to model-specific structures. To address this, we propose ZO Fine-tuner, a learning-based zeroth-order optimizer for LLMs that automatically learns efficient perturbation strategies through a compact and memory-efficient design. Crucially, our approach is motivated by the observation that only a small number of foundation models and their derivatives are widely adopted in practice. Therefore, learning the optimizer once for a given LLM and reusing it across diverse downstream tasks is both feasible and highly desirable. Accordingly, ZO Fine-tuner is designed to scale learning to learn (L2L) to the foundation-model era by supporting one-time training per LLM with minimal overhead. Experiments on 4 LLMs and 7 datasets show that ZO Fine-tuner outperforms prior zeroth-order baselines in 82.1% of task-model combinations, thereby demonstrating strong performance and scalability for efficient LLM fine-tuning. Our code is available at https://github.com/ASTRAL-Group/ZO_Fine_tuner.git.
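To make the setting concrete, here is a minimal sketch of the kind of hand-crafted, static zeroth-order scheme the paper improves on: a two-point SPSA-style estimator that approximates the gradient from two loss evaluations along a random Gaussian direction. This is an illustrative toy (NumPy, quadratic loss), not the paper's method; the function and variable names are our own.

```python
import numpy as np

def spsa_grad_estimate(loss_fn, theta, eps=1e-3, rng=None):
    """Two-point zeroth-order gradient estimate.

    Approximates grad L(theta) as
        (L(theta + eps*z) - L(theta - eps*z)) / (2*eps) * z,
    where z is a fixed Gaussian perturbation -- the static sampling
    scheme that ZO Fine-tuner replaces with a learned one.
    """
    rng = np.random.default_rng() if rng is None else rng
    z = rng.standard_normal(theta.shape)  # static Gaussian direction
    scale = (loss_fn(theta + eps * z) - loss_fn(theta - eps * z)) / (2 * eps)
    return scale * z

# Toy usage: minimize a quadratic with plain ZO-SGD.
loss = lambda w: float(np.sum(w ** 2))
rng = np.random.default_rng(0)
w = np.ones(4)
for _ in range(2000):
    w -= 0.05 * spsa_grad_estimate(loss, w, rng=rng)
```

Each step needs only two forward passes and no backward pass, which is why ZO methods save GPU memory on LLMs; the cost is noisy gradient estimates, and the quality of the random direction z is exactly what a learned perturbation strategy aims to improve.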
Problem

Research questions and friction points this paper is trying to address.

Learning adaptive perturbation strategies for zeroth-order LLM fine-tuning
Replacing hand-crafted static sampling with learned optimization methods
Enabling one-time optimizer training per foundation model for multiple tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Learns perturbation strategies automatically for LLMs
Reuses learned optimizer across diverse downstream tasks
Supports one-time training per LLM with minimal overhead
Authors

Kairun Zhang (University of Chicago)
Haoyu Li (UIUC)
Yanjun Zhao (UIUC)
Yifan Sun (UIUC)
Huan Zhang (UIUC)