APT: Improving Diffusion Models for High Resolution Image Generation with Adaptive Path Tracing

📅 2025-07-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Diffusion models face two key challenges in high-resolution image generation: (1) degraded generalization due to fixed-resolution training, and (2) patch-level distribution shift and increased monotonicity induced by patch-wise, training-free inference methods. Both compromise detail fidelity and sampling efficiency. To address these issues without additional training, the authors propose Adaptive Path Tracing (APT), a framework that combines statistical matching and scale-aware scheduling in the latent space to calibrate patch-wise distributions and optimize denoising trajectories. Applied to latent diffusion models with patch fusion, the method improves texture sharpness and structural coherence in high-resolution outputs while accelerating sampling. Experiments report better results than existing training-free approaches on standard metrics, including lower FID and LPIPS, with a favorable balance between generation quality and inference efficiency.

📝 Abstract
Latent Diffusion Models (LDMs) are generally trained at fixed resolutions, limiting their capability when scaling up to high-resolution images. While training-based approaches address this limitation by training on high-resolution datasets, they require large amounts of data and considerable computational resources, making them less practical. Consequently, training-free methods, particularly patch-based approaches, have become a popular alternative. These methods divide an image into patches and fuse the denoising paths of each patch, showing strong performance on high-resolution generation. However, we observe two critical issues for patch-based approaches, which we call "patch-level distribution shift" and "increased patch monotonicity." To address these issues, we propose Adaptive Path Tracing (APT), a framework that combines Statistical Matching, which keeps patch distributions consistent in upsampled latents, with Scale-aware Scheduling, which mitigates patch monotonicity. As a result, APT produces clearer and more refined details in high-resolution images. In addition, APT enables a shortcut denoising process, resulting in faster sampling with minimal quality degradation. Our experimental results confirm that APT produces more detailed outputs with improved inference speed, providing a practical approach to high-resolution image generation.
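The patch-based idea mentioned in the abstract (split the latent into patches, process each, then fuse the results) can be illustrated with a minimal one-dimensional sketch. This is not the paper's implementation: the function name `fuse_patches`, the `denoise` callback, and the choice to fuse by averaging overlaps are all assumptions made for illustration, and real methods operate on 2D latent tensors inside a diffusion sampler.

```python
from typing import Callable, List


def fuse_patches(
    latent: List[float],
    patch: int,
    stride: int,
    denoise: Callable[[List[float]], List[float]],
) -> List[float]:
    """Apply `denoise` to overlapping windows and average the overlaps.

    Hypothetical helper: averaging is one common way to fuse
    per-patch results, assumed here for illustration only.
    """
    n = len(latent)
    if patch > n or patch <= 0 or stride <= 0:
        raise ValueError("need 0 < patch <= len(latent) and stride > 0")

    total = [0.0] * n   # running sum of patch outputs per position
    count = [0] * n     # how many patches covered each position

    # Window start positions; add a final window so the tail is covered.
    starts = list(range(0, n - patch + 1, stride))
    if starts[-1] != n - patch:
        starts.append(n - patch)

    for s in starts:
        out = denoise(latent[s:s + patch])
        for i, value in enumerate(out):
            total[s + i] += value
            count[s + i] += 1

    return [t / c for t, c in zip(total, count)]


# Stand-in "denoiser" that simply halves every value in the patch.
fused = fuse_patches(
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    patch=4,
    stride=2,
    denoise=lambda p: [0.5 * v for v in p],
)
```

Because every position is covered by at least one window, the fused output has the same length as the input, and positions covered by several windows receive the mean of their per-patch results.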
Problem

Research questions and friction points this paper is trying to address.

LDMs struggle with high-resolution images due to fixed training resolutions
Patch-based methods face patch-level distribution shift and monotonicity issues
Can a training-free method such as Adaptive Path Tracing (APT) resolve both issues and improve high-resolution generation?
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive Path Tracing (APT), a training-free framework for high-resolution image generation
Statistical Matching keeps patch distributions consistent in upsampled latents
Scale-aware Scheduling reduces patch monotonicity
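To give a rough intuition for what matching patch statistics could look like, the sketch below shifts and rescales a patch so that its mean and standard deviation equal those of a reference. The exact formulation of Statistical Matching is not given on this page, so first- and second-moment alignment is an assumed stand-in, and `match_statistics` is a hypothetical name.

```python
from statistics import fmean, pstdev
from typing import List


def match_statistics(
    patch: List[float],
    reference: List[float],
    eps: float = 1e-8,
) -> List[float]:
    """Rescale `patch` so its mean and std match those of `reference`.

    Assumption: simple moment matching, used here only to illustrate
    the idea of correcting a patch-level distribution shift.
    """
    mu_p, sd_p = fmean(patch), pstdev(patch)
    mu_r, sd_r = fmean(reference), pstdev(reference)
    scale = sd_r / (sd_p + eps)  # eps guards against a constant patch
    return [(v - mu_p) * scale + mu_r for v in patch]


# A patch whose statistics drifted away from the reference values.
matched = match_statistics([0.0, 2.0, 4.0, 6.0], [10.0, 11.0, 12.0, 13.0])
```

After matching, the patch keeps its internal ordering and relative spacing but has the reference mean and (up to `eps`) the reference standard deviation.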