🤖 AI Summary
Existing fine-tuning methods rely on dense parameter updates to generate new representations, rendering model adaptation to downstream tasks largely opaque. To address this, we propose a sparse-coding-inspired Transformer fine-tuning framework: it introduces a learnable feature dictionary as the fundamental representation unit, reconstructs intermediate-layer features via sparse linear combinations, and explicitly quantifies each dictionary atom’s contribution to the task through interpretable sparse coefficients. This enables both explainable model adaptation and fine-grained concept customization. We instantiate the framework in image editing and text-to-image generation by integrating sparse coding, dictionary learning, and atom selection mechanisms. Experimental results demonstrate substantial improvements in text–image alignment accuracy, consistently outperforming mainstream fine-tuning approaches—including LoRA and Adapter—across multiple quantitative metrics.
📝 Abstract
Large pre-trained transformers have revolutionized artificial intelligence across various domains, and fine-tuning remains the dominant approach for adapting these models to downstream tasks due to the cost of training from scratch. However, in existing fine-tuning methods, the updated representations are formed as a dense combination of modified parameters, making it challenging to interpret their contributions and understand how the model adapts to new tasks. In this work, we introduce a fine-tuning framework inspired by sparse coding, where fine-tuned features are represented as a sparse combination of basic elements, i.e., feature dictionary atoms. The feature dictionary atoms function as fundamental building blocks of the representation, and tuning atoms allows for seamless adaptation to downstream tasks. Sparse coefficients then serve as indicators of atom importance, identifying the contribution of each atom to the updated representation. Leveraging the atom selection capability of sparse coefficients, we first demonstrate that our method enhances image editing performance by improving text alignment through the removal of unimportant feature dictionary atoms. Additionally, we validate the effectiveness of our approach in the text-to-image concept customization task, where our method efficiently constructs the target concept using a sparse combination of feature dictionary atoms, outperforming various baseline fine-tuning methods.
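The core mechanism the abstract describes — reconstructing a feature as a sparse combination of dictionary atoms, then using the coefficient magnitudes to select or prune atoms — can be illustrated with a minimal NumPy sketch. This is not the paper's implementation; the dictionary, the ISTA solver, and the pruning threshold here are illustrative assumptions:

```python
import numpy as np

def ista_sparse_code(x, D, lam=0.05, lr=0.1, steps=1000):
    """Solve min_a 0.5*||x - D @ a||^2 + lam*||a||_1 via ISTA
    (gradient step followed by soft-thresholding)."""
    a = np.zeros(D.shape[1])
    for _ in range(steps):
        a = a - lr * (D.T @ (D @ a - x))                    # gradient step
        a = np.sign(a) * np.maximum(np.abs(a) - lr * lam, 0.0)  # soft threshold
    return a

rng = np.random.default_rng(0)
d, k = 16, 32                                # feature dim, number of atoms
D = rng.normal(size=(d, k))
D /= np.linalg.norm(D, axis=0)               # unit-norm dictionary atoms

# A "fine-tuned feature" that truly uses only three atoms.
true_a = np.zeros(k)
true_a[[3, 7, 20]] = [1.5, -2.0, 1.0]
x = D @ true_a

a = ista_sparse_code(x, D)                   # sparse coefficients
keep = np.abs(a) > 0.1                       # coefficient magnitude = atom importance
x_pruned = D[:, keep] @ a[keep]              # reconstruct from selected atoms only
```

The coefficient vector `a` plays the role of the importance indicator described above: atoms with near-zero coefficients contribute little to the reconstructed feature and can be dropped, which is the atom-selection step the paper leverages for improving text alignment in image editing.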