ResAdapter: Domain Consistent Resolution Adapter for Diffusion Models

📅 2024-03-04
🏛️ arXiv.org
📈 Citations: 10
Influential: 0
🤖 AI Summary
Current text-to-image models (e.g., Stable Diffusion) and personalization methods (e.g., DreamBooth, LoRA) degrade severely in quality at resolutions outside their training domain and lack native support for arbitrary aspect ratios. To address this, we propose ResAdapter, a lightweight (0.5M parameters), plug-and-play resolution adapter that learns pure resolution priors and requires no backbone fine-tuning or post-processing. Designed around the diffusion model architecture, ResAdapter preserves the original personalized style domain while enabling end-to-end generation at arbitrary resolutions and aspect ratios. It is compatible with DreamBooth, LoRA, ControlNet, and IP-Adapter, and can be integrated into ElasticDiffusion. Extensive experiments demonstrate that ResAdapter significantly improves generation quality, inference efficiency, and cross-resolution consistency, particularly at high resolutions, without compromising fidelity or personalization capability.

📝 Abstract
Recent advances in text-to-image models (e.g., Stable Diffusion) and corresponding personalization techniques (e.g., DreamBooth and LoRA) enable individuals to generate high-quality and imaginative images. However, these models often struggle to generate images at resolutions outside their trained domain. To overcome this limitation, we present the Resolution Adapter (ResAdapter), a domain-consistent adapter designed for diffusion models to generate images with unrestricted resolutions and aspect ratios. Unlike other multi-resolution generation methods, which process images of static resolution with complex post-processing operations, ResAdapter directly generates images at dynamic resolutions. In particular, after learning pure resolution priors from a general dataset, ResAdapter generates resolution-free images with personalized diffusion models while preserving their original style domain. Comprehensive experiments demonstrate that ResAdapter, with only 0.5M parameters, can process images with flexible resolutions for arbitrary diffusion models. Further experiments demonstrate that ResAdapter is compatible with other modules (e.g., ControlNet, IP-Adapter, and LCM-LoRA) for image generation across a broad range of resolutions, and can be integrated into other multi-resolution models (e.g., ElasticDiffusion) to efficiently generate higher-resolution images. Project link: https://res-adapter.github.io
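To make "unrestricted resolutions and aspect ratios" concrete, the following is a minimal sketch of how a caller might pick a target size for a latent diffusion model. It is not taken from the paper. The helper names `resolution_for` and `latent_shape` are hypothetical, and the sketch assumes a Stable Diffusion style VAE that downsamples by a factor of 8 into 4 latent channels, so both image sides are snapped to a multiple of 8.

```python
import math


def resolution_for(aspect_ratio: float,
                   pixel_budget: int = 512 * 512,
                   multiple: int = 8) -> tuple[int, int]:
    """Return (width, height) near the requested aspect ratio and pixel budget.

    Both sides are snapped to `multiple` (assumed VAE downsampling factor).
    """
    if aspect_ratio <= 0:
        raise ValueError("aspect_ratio must be positive")
    # Solve width * height = pixel_budget with width / height = aspect_ratio.
    height = math.sqrt(pixel_budget / aspect_ratio)
    width = height * aspect_ratio

    def snap(value: float) -> int:
        # Round to the nearest multiple, never below one multiple.
        return max(multiple, int(round(value / multiple)) * multiple)

    return snap(width), snap(height)


def latent_shape(width: int, height: int,
                 channels: int = 4, factor: int = 8) -> tuple[int, int, int]:
    """Shape (C, H, W) of the latent the denoiser would see (assumed layout)."""
    return (channels, height // factor, width // factor)


if __name__ == "__main__":
    # A 16:9 image with roughly one megapixel.
    w, h = resolution_for(16 / 9, pixel_budget=1024 * 1024)
    print(w, h, latent_shape(w, h))
```

The resulting width and height would then be passed to whatever generation pipeline is in use. Whether the output is good at that size is exactly the question the paper addresses, since a base model without a resolution adapter tends to degrade away from its training resolution.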
Problem

Research questions and friction points this paper is trying to address.

Generating images at unrestricted resolutions and aspect ratios
Preserving the original style domain of personalized diffusion models
Remaining compatible with other modules across a broad range of resolutions
Innovation

Methods, ideas, or system contributions that make the work stand out.

ResAdapter enables unrestricted resolution image generation.
Learns resolution priors for domain-consistent image generation.
Compatible with various modules for multi-resolution image generation.
Authors
Jiaxiang Cheng, ByteDance Inc
Pan Xie, Bytedance (multimodal generation)
Xin Xia, ByteDance Inc
Jiashi Li, ByteDance Inc (Image/Video Generation, Train/Infer Infra)
Jie Wu, ByteDance Inc
Yuxi Ren, ByteDance Inc
Huixia Li, ByteDance Inc
Xuefeng Xiao, ByteDance Seed (Computer Vision, Efficient AI)
Min Zheng, ByteDance Inc
Lean Fu, ByteDance Inc