Decompose, Mix, Adapt: A Unified Framework for Parameter-Efficient Neural Network Recombination and Compression

📅 2026-03-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing parameter reparameterization methods are typically confined to a single objective, either parameter-efficient fine-tuning or model compression, making it difficult to address both demands simultaneously under resource constraints. This work proposes CRISP, a unified framework that jointly achieves model compression and parameter-efficient fine-tuning within a single architecture. CRISP decomposes pre-trained weights into shared basis matrices and lightweight mixing coefficients, enhanced by cross-layer basis sharing and an interpolation-based gated coefficient recombination mechanism. Requiring fewer than 200 trainable parameters in some experiments, CRISP outperforms existing approaches by 1% on joint compression-and-fine-tuning tasks, surpasses state-of-the-art methods by up to 1.5% in pure parameter-efficient fine-tuning, and achieves a consistent 4–5% improvement in overall dual-task performance.
📝 Abstract
Parameter Recombination (PR) methods aim to efficiently compose the weights of a neural network for applications such as Parameter-Efficient Fine-Tuning (PEFT) and Model Compression (MC), among others. Most methods focus on a single PR application, which makes composing them challenging. For example, when deploying a large model you may wish both to compress it and to adapt it quickly to new settings. However, PEFT methods can still contain millions of parameters. This is small relative to the original model size, but can be problematic in resource-constrained deployments such as edge devices, where the adapter occupies a larger fraction of the compressed model's parameters. To address this, we present Coefficient-gated weight Recombination by Interpolated Shared basis Projections (CRISP), a general approach that seamlessly integrates multiple PR tasks within the same framework. CRISP accomplishes this by factorizing pretrained weights into basis matrices and their component mixing projections. Sharing basis matrices across layers and adjusting their size enables MC, while the small size of the mixer weights (fewer than 200 parameters in some experiments) enables CRISP to support PEFT. Experiments show CRISP outperforms prior methods capable of dual-task application by 4–5%, while also surpassing the state of the art in PEFT by 1.5% and in PEFT+MC combinations by 1%. Our code is available on the repository: https://github.com/appledora/CRISP-CVPR26.
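The core idea described in the abstract, factorizing layer weights into a shared basis plus small per-layer mixing coefficients, can be sketched as follows. This is a minimal illustration using a plain SVD over stacked layer weights, not CRISP's actual algorithm; the function names and the choice of SVD are assumptions for exposition.

```python
import numpy as np

def factorize_layers(weights, rank):
    """Factorize a list of layer weight matrices (each d_out x d_in) into one
    shared basis and per-layer mixing coefficients.

    Illustrative sketch only: CRISP's published factorization, gating, and
    interpolation mechanisms are not reproduced here.
    """
    # Stack all layers' columns and extract a shared column basis via SVD.
    stacked = np.concatenate(weights, axis=1)          # (d_out, n_layers * d_in)
    U, _, _ = np.linalg.svd(stacked, full_matrices=False)
    basis = U[:, :rank]                                # shared basis (d_out, rank)
    # Each layer keeps only a small coefficient matrix projecting onto the basis.
    coeffs = [basis.T @ W for W in weights]            # (rank, d_in) per layer
    return basis, coeffs

def reconstruct(basis, coeff):
    """Recombine one layer's weights from the shared basis and its coefficients."""
    return basis @ coeff
```

Shrinking `rank` trades reconstruction fidelity for compression (the MC axis), while fine-tuning only the small `coeffs` matrices, rather than full weights, corresponds to the PEFT axis the abstract describes.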
Problem

Research questions and friction points this paper is trying to address.

Parameter Recombination
Parameter-Efficient Fine-Tuning
Model Compression
Neural Network Compression
Edge Deployment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Parameter Recombination
Model Compression
Parameter-Efficient Fine-Tuning
Shared Basis Factorization
Neural Network Compression