Unified Scaling Laws for Compressed Representations

📅 2025-06-02
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing performance prediction methods for model compression—such as sparsification, scalar/vector quantization, and their combinations—lack a unified scaling law, hindering systematic comparison and design across compression formats. Method: We propose a general-purpose performance prediction framework applicable across diverse compression paradigms. Its core innovation is a "capacity" metric, defined via a representation's ability to fit random Gaussian data, which enables robust prediction of parameter efficiency and direct comparison of accuracy potential across formats. The method combines theoretical analysis with empirical validation, including Gaussian-fitting evaluation, composite compression modeling, and a joint sparsity-quantization training algorithm. Results: The scaling law predicts performance accurately across sparsity-only, quantization-only, and hybrid compression regimes, and yields improved training efficiency and final model accuracy, providing both a generalizable theoretical foundation and a practical tool for designing and evaluating compressed models.

📝 Abstract
Scaling laws have shaped recent advances in machine learning by enabling predictable scaling of model performance based on model size, computation, and data volume. Concurrently, the rise in computational cost for AI has motivated model compression techniques, notably quantization and sparsification, which have emerged to mitigate the steep computational demands associated with large-scale training and inference. This paper investigates the interplay between scaling laws and compression formats, exploring whether a unified scaling framework can accurately predict model performance when training occurs over various compressed representations, such as sparse, scalar-quantized, sparse-quantized, or even vector-quantized formats. Our key contributions include validating a general scaling law formulation and showing that it is applicable both individually and composably across compression types. Based on this, our main finding is demonstrating both theoretically and empirically that there exists a simple "capacity" metric -- based on the representation's ability to fit random Gaussian data -- which can robustly predict parameter efficiency across multiple compressed representations. On the practical side, we extend our formulation to directly compare the accuracy potential of different compressed formats, and to derive better algorithms for training over sparse-quantized formats.
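The abstract describes the "capacity" metric only at a high level. As a rough illustration (not the paper's actual estimator), one way to proxy a format's capacity is the fraction of a random Gaussian vector's variance that survives compression; the function names, bit-widths, and density below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize(w, bits):
    """Round-to-nearest uniform scalar quantization over the empirical range."""
    levels = 2 ** bits
    lo, hi = w.min(), w.max()
    step = (hi - lo) / (levels - 1)
    return lo + np.round((w - lo) / step) * step

def sparsify(w, density):
    """Keep the largest-magnitude `density` fraction of entries, zero the rest."""
    k = int(np.ceil(density * w.size))
    out = np.zeros_like(w)
    idx = np.argsort(np.abs(w))[-k:]
    out[idx] = w[idx]
    return out

def gaussian_fit_capacity(compress, n=100_000):
    """Capacity proxy: fraction of a random Gaussian vector's variance
    preserved by the compressed representation (1 - normalized MSE)."""
    w = rng.standard_normal(n)
    w_hat = compress(w)
    return 1.0 - np.mean((w - w_hat) ** 2) / np.mean(w ** 2)

def sparse_then_quantize(w, density=0.5, bits=4):
    """Composite format: sparsify, then quantize only the surviving entries."""
    s = sparsify(w, density)
    nz = s != 0
    s[nz] = quantize(s[nz], bits)
    return s

print(f"4-bit quantization: {gaussian_fit_capacity(lambda w: quantize(w, 4)):.3f}")
print(f"50% sparsity:       {gaussian_fit_capacity(lambda w: sparsify(w, 0.5)):.3f}")
print(f"sparse + 4-bit:     {gaussian_fit_capacity(sparse_then_quantize):.3f}")
```

Under this toy proxy, the composite sparse-quantized format scores below either format alone, matching the intuition that the paper's framework composes capacities across compression types.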
Problem

Research questions and friction points this paper is trying to address.

Explores unified scaling laws for compressed AI model representations
Investigates performance prediction across sparse and quantized formats
Proposes a capacity metric for comparing compression efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Unified scaling law spanning sparse, scalar-quantized, vector-quantized, and hybrid formats
Composable formulation that applies across compression types
Gaussian-fit capacity metric that robustly predicts parameter efficiency