T-LoRA: Single Image Diffusion Model Customization Without Overfitting

📅 2025-07-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address overfitting, poor generalization, and limited generation diversity in single-image fine-tuning of diffusion models, this paper proposes a timestep-dependent Low-Rank Adaptation (LoRA) framework. The method dynamically modulates the rank constraint across diffusion timesteps and uses orthogonal initialization to keep adapter components independent, mitigating overfitting, which is most severe at high timesteps. Compared to standard LoRA and other personalization approaches, it achieves a superior trade-off between concept fidelity and text–image alignment. Empirically, it maintains strong generalization and high-fidelity generation even under extreme data scarcity (e.g., a single input image) while remaining computationally efficient, making it well suited for deployment where both training data and compute are severely constrained.

📝 Abstract
While diffusion model fine-tuning offers a powerful approach for customizing pre-trained models to generate specific objects, it frequently suffers from overfitting when training samples are limited, compromising both generalization capability and output diversity. This paper tackles the challenging yet most impactful task of adapting a diffusion model using just a single concept image, as single-image customization holds the greatest practical potential. We introduce T-LoRA, a Timestep-Dependent Low-Rank Adaptation framework specifically designed for diffusion model personalization. In our work we show that higher diffusion timesteps are more prone to overfitting than lower ones, necessitating a timestep-sensitive fine-tuning strategy. T-LoRA incorporates two key innovations: (1) a dynamic fine-tuning strategy that adjusts rank-constrained updates based on diffusion timesteps, and (2) a weight parametrization technique that ensures independence between adapter components through orthogonal initialization. Extensive experiments show that T-LoRA and its individual components outperform standard LoRA and other diffusion model personalization techniques. They achieve a superior balance between concept fidelity and text alignment, highlighting the potential of T-LoRA in data-limited and resource-constrained scenarios. Code is available at https://github.com/ControlGenAI/T-LoRA.
Problem

Research questions and friction points this paper is trying to address.

Prevent overfitting in single-image diffusion model customization
Balance concept fidelity and text alignment in fine-tuning
Optimize diffusion model adaptation for limited training data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Timestep-Dependent Low-Rank Adaptation framework
Dynamic fine-tuning strategy adjusts rank-constrained updates per diffusion timestep
Orthogonal weight initialization ensures independence between adapter components
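The core idea of a timestep-dependent rank constraint can be sketched in a few lines. This is a hypothetical illustration, not the paper's exact schedule: it assumes a linear mapping from timestep to effective rank, shrinking the number of active adapter components as the timestep (and thus the overfitting risk) grows.

```python
# Hypothetical sketch of a timestep-dependent LoRA rank schedule.
# Assumption: a linear schedule from r_max (clean image, t=0) down to
# r_min (pure noise, t=T); the paper's actual schedule may differ.

def effective_rank(t: int, T: int = 1000, r_max: int = 8, r_min: int = 1) -> int:
    """Shrink the usable LoRA rank as timestep t grows toward T."""
    frac = t / T  # 0 = low-noise timestep, 1 = high-noise timestep
    return max(r_min, round(r_max - frac * (r_max - r_min)))

def rank_mask(t: int, T: int = 1000, r_max: int = 8) -> list[int]:
    """Binary mask over r_max adapter components: 1 = active at timestep t."""
    r = effective_rank(t, T, r_max)
    return [1] * r + [0] * (r_max - r)
```

In training, such a mask would gate the columns of the LoRA update at each sampled timestep, so high-noise steps receive lower-rank (more constrained) updates.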