Efficient Training for Human Video Generation with Entropy-Guided Prioritized Progressive Learning

📅 2025-11-26
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the high computational cost and memory footprint of diffusion model training for high-resolution human video generation, this paper proposes an entropy-guided priority progressive learning framework. The method introduces Conditional Entropy Inflation (CEI) to dynamically quantify the importance of model components for the first time, and integrates it with an adaptive progressive training schedule and a convergence efficiency measurement mechanism to enable intelligent allocation of training resources. Evaluated on three standard benchmarks, our approach achieves up to 2.2× faster training speed and 2.4× lower GPU memory consumption compared to baseline methods, without compromising generation quality. The core contribution lies in systematically incorporating information entropy theory into diffusion model training optimization—establishing the first entropy-guided paradigm for efficient video generation training.

Technology Category

Application Category

📝 Abstract
Human video generation has advanced rapidly with the development of diffusion models, but the high computational cost and substantial memory consumption associated with training these models on high-resolution, multi-frame data pose significant challenges. In this paper, we propose Entropy-Guided Prioritized Progressive Learning (Ent-Prog), an efficient training framework tailored for diffusion models on human video generation. First, we introduce Conditional Entropy Inflation (CEI) to assess the importance of different model components on the target conditional generation task, enabling prioritized training of the most critical components. Second, we introduce an adaptive progressive schedule that adaptively increases computational complexity during training by measuring the convergence efficiency. Ent-Prog reduces both training time and GPU memory consumption while maintaining model performance. Extensive experiments across three datasets, demonstrate the effectiveness of Ent-Prog, achieving up to 2.2$ imes$ training speedup and 2.4$ imes$ GPU memory reduction without compromising generative performance.
Problem

Research questions and friction points this paper is trying to address.

Reducing computational cost and memory usage in human video generation training
Prioritizing critical model components through entropy-guided assessment
Adaptively increasing training complexity while maintaining generative performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Entropy-guided prioritized training of critical components
Adaptive progressive schedule increasing computational complexity
Reduces training time and GPU memory without performance loss
🔎 Similar Papers
No similar papers found.
Changlin Li
Changlin Li
Tencent
Deep LearningComputer Vision
J
Jiawei Zhang
North China Electric Power University
S
Shuhao Liu
North China Electric Power University
Sihao Lin
Sihao Lin
Postdoc, AIML, The University of Adelaide
Artificial intelligencePattern recognitionVision-language model
Z
Zeyi Shi
University of Technology Sydney
Zhihui Li
Zhihui Li
School of Information Science and Technology, University of Science and Technology of China
Artificial IntelligenceMachine LearningMultimedia
X
Xiaojun Chang
University of Science and Technology of China