Low-Precision Training of Large Language Models: Methods, Challenges, and Opportunities

📅 2025-05-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
Low-precision training of large language models (LLMs) suffers from fragmented research, poor hardware compatibility, and reproducibility challenges because the weights, activations, and gradients can each be represented in different numerical formats. Method: The survey proposes a numerically grounded, three-tier classification framework, covering fixed-point/integer, floating-point, and custom formats, that unifies how the quantization of all three components is modeled jointly. It integrates key techniques including quantization-aware training (QAT), gradient scaling, and error compensation, and introduces a cross-component collaborative quantization analysis. Contribution/Results: The authors construct a structured knowledge graph of the field and open-source *Awesome-Low-Precision-Training*, a repository surveying 200+ works. They identify six key research frontiers, improving method comparability, hardware adaptability, and experimental reproducibility, thereby advancing standardization in efficient LLM training.
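The summary mentions gradient scaling as one of the integrated techniques. As an illustrative aside (a minimal NumPy sketch, not the paper's specific method; the variable names and the scale factor are assumptions), loss scaling exists because small gradients underflow to zero when cast to float16:

```python
import numpy as np

# A true gradient too small for float16: the smallest fp16 subnormal is
# about 5.96e-8, so 1e-8 rounds to zero on a direct cast.
grad = np.float32(1e-8)
naive = np.float32(np.float16(grad))            # underflows to 0.0

# Loss scaling: multiply up before the low-precision cast, then divide
# back down in float32, so the value survives the fp16 round trip.
scale = np.float32(65536.0)                     # illustrative power of two
scaled = np.float32(np.float16(grad * scale)) / scale
```

In practice the scale factor is adjusted dynamically so that scaled gradients stay inside the representable fp16 range without overflowing.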

📝 Abstract
Large language models (LLMs) have achieved impressive performance across various domains. However, the substantial hardware resources required for their training present a significant barrier to efficiency and scalability. To mitigate this challenge, low-precision training techniques have been widely adopted, leading to notable advancements in training efficiency. Despite these gains, low-precision training involves several components – such as weights, activations, and gradients – each of which can be represented in different numerical formats. The resulting diversity has created a fragmented landscape in low-precision training research, making it difficult for researchers to gain a unified overview of the field. This survey provides a comprehensive review of existing low-precision training methods. To systematically organize these approaches, we categorize them into three primary groups based on their underlying numerical formats, which is a key factor influencing hardware compatibility, computational efficiency, and ease of reference for readers. The categories are: (1) fixed-point and integer-based methods, (2) floating-point-based methods, and (3) customized format-based methods. Additionally, we discuss quantization-aware training approaches, which share key similarities with low-precision training during forward propagation. Finally, we highlight several promising research directions to advance this field. A collection of papers discussed in this survey is provided at https://github.com/Hao840/Awesome-Low-Precision-Training.
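The abstract notes that quantization-aware training shares its forward pass with low-precision training. A minimal sketch of that shared mechanism (the `fake_quantize` helper and its bit width are hypothetical, not from the paper) simulates symmetric integer quantization while keeping the tensor in floating point:

```python
import numpy as np

def fake_quantize(x, num_bits=8):
    """Simulate symmetric integer quantization in the forward pass.

    Values are snapped to the nearest level of a uniform integer grid
    and mapped back to floats, so downstream ops see quantization error
    while all arithmetic stays in floating point.
    """
    qmax = 2 ** (num_bits - 1) - 1            # e.g. 127 for int8
    scale = np.max(np.abs(x)) / qmax          # per-tensor symmetric scale
    if scale == 0:
        return x
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale                          # dequantize back to float

w = np.array([-1.0, -0.3, 0.0, 0.42, 1.0])
w_q = fake_quantize(w, num_bits=4)            # 15 usable levels
```

In QAT the rounding step is typically paired with a straight-through estimator so gradients flow through it unchanged during backpropagation.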
Problem

Research questions and friction points this paper is trying to address.

Reducing hardware resource demands in LLM training
Organizing fragmented low-precision training research landscape
Reviewing methods for efficient low-precision LLM training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Low-precision training for efficient LLMs
Categorizes methods by numerical formats
Includes quantization-aware training approaches
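The taxonomy above is organized by numerical format. A short sketch (illustrative values, not drawn from the paper) shows the practical difference between the first two families: an integer format places values on a uniform grid, while a floating-point format has a non-uniform grid that is fine near zero and coarse far from it:

```python
import numpy as np

x = 0.1

# Fixed-point/integer: uniform grid; scale chosen here for a [-1, 1] range.
scale = 1.0 / 127                       # symmetric int8, illustrative
x_int8 = round(x / scale) * scale       # snap to nearest grid point

# Floating-point: round-trip through half precision (fp16).
x_fp16 = float(np.float16(x))
```

For a small value like 0.1, fp16 lands much closer than the int8 grid, which is one reason format choice interacts with the dynamic range of each component (weights, activations, gradients).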