🤖 AI Summary
To address the high barriers to fine-tuning large language models (LLMs) and multimodal LLMs (MLLMs), including fragmented toolchains and inadequate multimodal support, this work introduces the first open-source infrastructure to systematically enable lightweight fine-tuning of large multimodal models. The framework supports over 350 LLMs and 80 MLLMs through a modular architecture, dynamic model registration, unified training interfaces, and quantization-aware fine-tuning. It also integrates end-to-end capabilities, including inference (e.g., vLLM, TGI), evaluation, quantization, and deployment. Its key contributions are: (1) the first standardized, production-ready support for MLLM fine-tuning; (2) the broadest model coverage and most comprehensive end-to-end toolchain available; and (3) reproducible evaluation across benchmarks. Experiments on multiple benchmarks demonstrate substantial reductions in development overhead while validating the framework's efficiency, compatibility, and out-of-the-box usability.
📝 Abstract
Recent developments in Large Language Models (LLMs) and Multi-modal Large Language Models (MLLMs) have achieved superior performance and generalization capabilities, covering a wide range of traditional tasks. However, existing large-model training frameworks support only a limited number of models and techniques, and in particular lack support for new models, which makes fine-tuning LLMs challenging for most developers. Therefore, we develop SWIFT, a customizable one-stop infrastructure for large models. With support for over 350 LLMs and 80 MLLMs, SWIFT is the open-source framework that provides the most comprehensive support for fine-tuning large models. In particular, it is the first training framework to provide systematic support for MLLMs. Moreover, SWIFT integrates post-training processes such as inference, evaluation, and quantization to facilitate fast adoption of large models in various application scenarios, and offers helpful utilities such as benchmark comparisons among different training techniques.