Uni-Classifier: Leveraging Video Diffusion Priors for Universal Guidance Classifier

📅 2026-03-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the loss of generation quality caused by distribution mismatches between upstream and downstream models in generative AI pipelines. The authors propose Uni-Classifier, a lightweight, plug-and-play module that integrates video diffusion priors into a general-purpose guidance classifier, which the summary describes as a first. By injecting downstream task-aware prior knowledge during the denoising process, Uni-Classifier aligns the upstream model's outputs with the distribution the downstream model expects. The method is compatible with diverse generative architectures and can either be embedded in cascaded workflows to improve overall consistency or deployed on its own to improve the output quality of a single model. Experiments on video and 3D generation tasks report improved generation quality in both settings, which the authors take as evidence of strong generalization and broad applicability.

📝 Abstract

In practical AI workflows, complex tasks often involve chaining multiple generative models, such as using a video or 3D generation model after a 2D image generator. However, distributional mismatches between the output of upstream models and the expected input of downstream models frequently degrade overall generation quality. To address this issue, we propose Uni-Classifier (Uni-C), a simple yet effective plug-and-play module that leverages video diffusion priors to guide the denoising process of preceding models, thereby aligning their outputs with downstream requirements. Uni-C can also be applied independently to enhance the output quality of individual generative models. Extensive experiments across video and 3D generation tasks demonstrate that Uni-C consistently improves generation quality in both workflow-based and standalone settings, highlighting its versatility and strong generalization capability.
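The abstract does not state Uni-C's update rule, so the following is only a minimal sketch of generic classifier guidance, the family of techniques the name refers to. It assumes the standard formulation in which the predicted noise is shifted by the gradient of a log-score. The quadratic score in `toy_grad_log_prob` is a hypothetical stand-in for the video-diffusion-based classifier, and all function names and values here are illustrative, not taken from the paper.

```python
import numpy as np

def toy_grad_log_prob(x_t, target, sigma=1.0):
    """Gradient of a toy log-score: log p(y | x_t) = -||x_t - target||^2 / (2 sigma^2).

    Hypothetical stand-in for a learned guidance classifier.
    """
    return -(x_t - target) / sigma**2

def guided_eps(eps, grad, alpha_bar_t, scale):
    """Standard classifier guidance: shift the predicted noise against the score gradient."""
    return eps - scale * np.sqrt(1.0 - alpha_bar_t) * grad

def predict_x0(x_t, eps, alpha_bar_t):
    """Clean-sample estimate implied by a noise prediction at one denoising step."""
    return (x_t - np.sqrt(1.0 - alpha_bar_t) * eps) / np.sqrt(alpha_bar_t)

# Toy setup: the "downstream-friendly" region is centered at the origin.
x_t = np.array([2.0, 2.0])      # current noisy sample
target = np.array([0.0, 0.0])   # assumed mode preferred by the toy classifier
eps = np.zeros(2)               # upstream model's noise prediction (toy value)
alpha_bar_t = 0.5               # cumulative noise-schedule term at this step

grad = toy_grad_log_prob(x_t, target)
x0_plain = predict_x0(x_t, eps, alpha_bar_t)
x0_guided = predict_x0(x_t, guided_eps(eps, grad, alpha_bar_t, scale=1.0), alpha_bar_t)
```

In this toy case the guided estimate `x0_guided` lies closer to `target` than `x0_plain`, which is the sense in which guidance pulls the upstream model's output toward what a downstream stage prefers. A guidance scale of 0 recovers the unguided prediction.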

Problem

Research questions and friction points this paper is trying to address.

distributional mismatch

generative models

video generation

3D generation

AI workflows

Innovation

Methods, ideas, or system contributions that make the work stand out.

video diffusion priors

plug-and-play module

distribution alignment

universal guidance classifier

generative model chaining

🔎 Similar Papers

VideoPrism: A Foundational Visual Encoder for Video Understanding

2024-02-20, International Conference on Machine Learning, Citations: 30

What Matters in Detecting AI-Generated Videos like Sora?

2024-06-27, arXiv.org, Citations: 12