🤖 AI Summary
Existing parameter-efficient fine-tuning (PEFT) methods suffer from poor reproducibility, complex deployment, and inconsistent evaluation protocols, hindering rigorous methodological comparison and systematic benchmarking. To address these challenges, we introduce a unified, modular PEFT framework supporting 19 mainstream techniques (e.g., LoRA, Adapter), 27 datasets, and 12 NLP task categories. It provides standardized experimental configurations, cross-model and cross-task evaluation metrics, and robust control mechanisms for reproducible training. Built on the LLaMA-Factory architecture, the framework offers plug-and-play usability and extensibility, is open source, and has been empirically validated across multiple leading large language models. Our work substantially improves the reproducibility, comparability, and engineering efficiency of PEFT methods, establishing a trustworthy, standardized benchmark platform for efficient fine-tuning research and practice.
📝 Abstract
Parameter-Efficient Fine-Tuning (PEFT) methods address the increasing size of Large Language Models (LLMs). Currently, many newly introduced PEFT methods are challenging to replicate, deploy, or compare with one another. To address this, we introduce PEFT-Factory, a unified framework for efficient fine-tuning of LLMs using both off-the-shelf and custom PEFT methods. While its modular design supports extensibility, it natively provides a representative set of 19 PEFT methods, 27 classification and text-generation datasets covering 12 tasks, and both standard and PEFT-specific evaluation metrics. As a result, PEFT-Factory offers a ready-to-use, controlled, and stable environment that improves the replicability and benchmarking of PEFT methods. PEFT-Factory is a downstream framework derived from the popular LLaMA-Factory, and is publicly available at https://github.com/kinit-sk/PEFT-Factory
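To illustrate the kind of method the framework standardizes: LoRA, one of the 19 supported techniques, freezes a pretrained weight matrix W and trains only a low-rank update (alpha/r)·B·A. The sketch below is a minimal NumPy illustration of that idea, not PEFT-Factory's actual implementation; the class name `LoRALinear` and all hyperparameter defaults are illustrative assumptions.

```python
import numpy as np

class LoRALinear:
    """Illustrative LoRA layer (not PEFT-Factory's implementation).

    Effective weight is W + (alpha / r) * B @ A, where only A and B are
    trainable. B is zero-initialized, so at the start of fine-tuning the
    adapted layer reproduces the frozen pretrained layer exactly.
    """

    def __init__(self, weight: np.ndarray, r: int = 8, alpha: float = 16.0, seed: int = 0):
        d_out, d_in = weight.shape
        rng = np.random.default_rng(seed)
        self.weight = weight                     # frozen pretrained weight
        self.scale = alpha / r
        self.A = rng.normal(0.0, 0.02, (r, d_in))  # trainable, small random init
        self.B = np.zeros((d_out, r))              # trainable, zero init

    def __call__(self, x: np.ndarray) -> np.ndarray:
        # x: (batch, d_in) -> (batch, d_out)
        return x @ (self.weight + self.scale * self.B @ self.A).T

# At initialization the low-rank update is zero, so outputs match the base layer.
W = np.eye(4)
layer = LoRALinear(W, r=2)
x = np.ones((1, 4))
assert np.allclose(layer(x), x @ W.T)
```

Because only A and B (2·r·d parameters per layer) are updated, the trainable-parameter count stays a small fraction of the full model, which is the property all PEFT methods in the framework share.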