🤖 AI Summary
To address the high computational cost and low inference efficiency of text-to-image diffusion models, this paper proposes a noise-relative-magnitude-aware token-level pruning and caching co-acceleration method. The authors design a cluster-aware token pruning mechanism that combines K-means spatial clustering with token importance estimation to dynamically retain visually salient tokens exhibiting both local consistency and semantic criticality. They further introduce distribution-balanced sampling and cross-step token caching to improve computational efficiency. While preserving generation quality (measured by FID, CLIP Score, and other standard metrics), the method reduces inference FLOPs by 50-60%, significantly improving throughput and energy efficiency. Unlike conventional uniform or global pruning approaches, it enables adaptive, semantics-guided token sparsification and intelligent reuse, establishing a new paradigm for efficient diffusion-model inference.
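The pruning idea described above can be sketched roughly as follows. This is an illustrative toy, not the authors' implementation: the importance score, the minimal K-means routine, and the per-cluster quota (`keep_ratio`) are all assumptions standing in for the paper's actual criteria.

```python
# Hypothetical sketch: score tokens by the relative magnitude of their noise
# change between denoising steps, cluster them spatially with K-means, and keep
# the top-scoring tokens per cluster (a distribution-balanced quota).
import numpy as np

def kmeans(points, k, iters=10, seed=0):
    """Minimal K-means over token spatial coordinates; returns cluster labels."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(points[:, None] - centers[None], axis=-1)
        labels = dists.argmin(axis=1)
        for c in range(k):
            if (labels == c).any():
                centers[c] = points[labels == c].mean(axis=0)
    return labels

def select_tokens(noise_t, noise_prev, coords, keep_ratio=0.4, k=8):
    """Return indices of tokens to keep, balanced across spatial clusters."""
    # Importance: magnitude of the noise change for each token between steps.
    importance = np.linalg.norm(noise_t - noise_prev, axis=-1)
    labels = kmeans(coords, k)
    keep = []
    for c in range(k):
        idx = np.where(labels == c)[0]
        n_keep = max(1, int(keep_ratio * len(idx)))  # per-cluster quota
        keep.extend(idx[np.argsort(-importance[idx])[:n_keep]])
    return np.sort(np.array(keep))
```

The per-cluster quota is what keeps the retained set spatially spread out: a purely global top-k on importance could concentrate all kept tokens in one image region.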
📝 Abstract
Diffusion models have revolutionized generative tasks, especially text-to-image synthesis; however, their iterative denoising process demands substantial computational resources. In this paper, we present a novel acceleration strategy that integrates token-level pruning with caching techniques to tackle this computational challenge. Using noise relative magnitude, we identify significant token changes across denoising iterations. We further enhance token selection by incorporating spatial clustering and enforcing distributional balance. Our experiments reveal a 50%-60% reduction in computational costs while preserving model performance, markedly increasing the efficiency of diffusion models. The code is available at https://github.com/ada-cheng/CAT-Pruning
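The caching side of the strategy can be illustrated with a small sketch. The function below is hypothetical (its name, signature, and the `compute_fn` callback are illustrative, not the repository's API); it shows the general pattern of recomputing only the retained tokens and reusing the previous step's outputs for the pruned ones.

```python
# Hypothetical sketch of cross-step token caching: outputs for pruned tokens
# are reused from the previous denoising step instead of being recomputed.
import numpy as np

def denoise_step_with_cache(tokens, keep_idx, cache, compute_fn):
    """Run compute_fn only on kept tokens; fill pruned tokens from the cache."""
    out = cache.copy() if cache is not None else np.zeros_like(tokens)
    out[keep_idx] = compute_fn(tokens[keep_idx])  # fresh compute for salient tokens
    return out  # this also serves as the cache for the next step
```

Because only `len(keep_idx)` tokens pass through `compute_fn`, the per-step FLOPs scale with the keep ratio, which is where the reported 50%-60% savings would come from under this pattern.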