Memory-Efficient Personalization of Text-to-Image Diffusion Models via Selective Optimization Strategies

📅 2025-07-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
Personalizing text-to-image diffusion models on edge devices must balance high memory overhead, privacy risks, and degraded generation quality. To address this, the paper proposes a timestep-aware selective optimization framework that dynamically couples low-resolution backpropagation (BP-low) with high-resolution zeroth-order optimization (ZO-high), switching between them according to the diffusion timestep: BP-low is employed at early timesteps to accelerate convergence, while ZO-high is activated at later timesteps to preserve fine-grained details. A novel timestep-aware probabilistic scheduling mechanism mitigates both the structural distortions inherent in low-resolution training and the slow convergence of zeroth-order methods. Experiments show that the approach reduces memory consumption by 42%–68% while matching the generation quality of full-parameter fine-tuning, enabling efficient, privacy-preserving, and scalable on-device personalization under resource constraints.
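The switching rule above can be sketched as a probabilistic schedule over optimization strategies. The paper's exact scheduling function is not reproduced here; the linear blend below (ZO-high becomes more likely at high diffusion timesteps, where the abstract says BP-low tends to overfit, and as training progresses) is purely illustrative. `zo_high_probability`, its equal weighting, and the default step counts are assumptions, not the authors' formulation.

```python
import random

def zo_high_probability(t, step, num_timesteps=1000, total_steps=1000):
    """Illustrative schedule (assumed form, not the paper's function):
    the probability of choosing ZO-high grows with the diffusion
    timestep t and with training progress."""
    timestep_term = t / num_timesteps    # structure-critical high timesteps
    progress_term = step / total_steps   # favor ZO-high later in training
    return min(1.0, 0.5 * timestep_term + 0.5 * progress_term)

def select_strategy(t, step, rng=random.random):
    """Sample one of the two optimization strategies for this update."""
    return "ZO-high" if rng() < zo_high_probability(t, step) else "BP-low"
```

In this sketch a single coin flip per training step decides whether the update uses low-resolution backpropagation or high-resolution zeroth-order optimization, so the two strategies interleave rather than run in fixed phases.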

📝 Abstract
Memory-efficient personalization is critical for adapting text-to-image diffusion models while preserving user privacy and operating within the limited computational resources of edge devices. To this end, we propose a selective optimization framework that adaptively chooses between backpropagation on low-resolution images (BP-low) and zeroth-order optimization on high-resolution images (ZO-high), guided by the characteristics of the diffusion process. As observed in our experiments, BP-low efficiently adapts the model to target-specific features, but suffers from structural distortions due to resolution mismatch. Conversely, ZO-high refines high-resolution details with minimal memory overhead but faces slow convergence when applied without prior adaptation. By complementing both methods, our framework leverages BP-low for effective personalization while using ZO-high to maintain structural consistency, achieving memory-efficient and high-quality fine-tuning. To maximize the efficacy of both BP-low and ZO-high, we introduce a timestep-aware probabilistic function that dynamically selects the appropriate optimization strategy based on diffusion timesteps. This function mitigates the overfitting from BP-low at high timesteps, where structural information is critical, while ensuring ZO-high is applied more effectively as training progresses. Experimental results demonstrate that our method achieves competitive performance while significantly reducing memory consumption, enabling scalable, high-quality on-device personalization without increasing inference latency.
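For context on why zeroth-order optimization has "minimal memory overhead": a standard two-point (SPSA-style) estimator needs only forward loss evaluations, so no intermediate activations are stored for backpropagation. The sketch below is a generic textbook estimator, not the paper's ZO-high implementation; `zo_gradient`, the smoothing scale `mu`, and the Gaussian perturbation are illustrative choices.

```python
import numpy as np

def zo_gradient(loss_fn, params, mu=1e-3, rng=None):
    """Two-point zeroth-order gradient estimate (SPSA-style).

    Only two forward evaluations of loss_fn are needed, so nothing is
    cached for a backward pass -- the source of ZO's memory savings.
    Generic sketch; not the paper's exact ZO-high procedure.
    """
    rng = rng or np.random.default_rng(0)
    u = rng.standard_normal(params.shape)  # random perturbation direction
    g = (loss_fn(params + mu * u) - loss_fn(params - mu * u)) / (2 * mu)
    return g * u  # directional (unbiased in expectation) gradient estimate
```

The estimate is noisy because each call probes a single random direction, which is consistent with the abstract's observation that ZO-high converges slowly without prior adaptation.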
Problem

Research questions and friction points this paper is trying to address.

Memory-efficient personalization of diffusion models
Balancing low-resolution and high-resolution optimization
Dynamic strategy selection for effective fine-tuning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Selective optimization framework combining BP-low and ZO-high
Timestep-aware probabilistic function for dynamic strategy selection
Memory-efficient on-device personalization without latency increase