Scholar
Daquan Zhou
Google Scholar ID: DdCAbWwAAAAJ
Bytedance, US
Artificial Intelligence
Deep learning
Homepage
Google Scholar
Citations & Impact (all-time)
Citations: 14,360
H-index: 36
i10-index: 46
Publications: 20
Co-authors: 5
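For readers unfamiliar with the metrics above: the h-index is the largest h such that h papers each have at least h citations, and the i10-index counts papers with at least 10 citations. A minimal sketch of both computations, using made-up citation counts (not this scholar's actual data):

```python
def h_index(citations):
    """Largest h such that at least h papers have >= h citations each."""
    counts = sorted(citations, reverse=True)
    h = 0
    for rank, c in enumerate(counts, start=1):
        if c >= rank:
            h = rank  # the paper at this rank still has enough citations
        else:
            break
    return h

def i10_index(citations):
    """Number of papers with at least 10 citations."""
    return sum(1 for c in citations if c >= 10)

# Hypothetical citation counts, for illustration only.
sample = [120, 85, 40, 12, 9, 3]
print(h_index(sample))    # -> 5 (five papers have >= 5 citations each)
print(i10_index(sample))  # -> 4 (four papers have >= 10 citations)
```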
Contact
GitHub
LinkedIn
Publications
15 items
Enhancing Spatial Understanding in Image Generation via Reward Modeling (2026, cited 0)
Rethinking Video Generation Model for the Embodied World (2026, cited 2)
MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head (2026, cited 2)
EgoReAct: Egocentric Video-Driven 3D Human Reaction Generation (2025, cited 0)
SANA-Video: Efficient Video Generation with Block Linear Diffusion Transformer (2025, cited 0)
OneVAE: Joint Discrete and Continuous Optimization Helps Discrete Video VAE Train Better (2025, cited 0)
MagicDistillation: Weak-to-Strong Video Distillation for Large-Scale Portrait Few-Step Synthesis (2025, cited 0)
AR-1-to-3: Single Image to Consistent 3D Object Generation via Next-View Prediction (2025, cited 0)
Resume (English only)
Academic Achievements
HunyuanVideo: A Systematic Framework For Large Video Generative Models
StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation
Magic-Me: Identity-Specific Video Customized Diffusion
MagicVideo & MagicVideo-V2: Efficient High-Aesthetic Video Generation with Latent Diffusion
DiffFit: Unlocking transferability of large diffusion models via parameter-efficient fine-tuning (CVPR 2023)
Expanding small-scale datasets with guided imagination (NeurIPS 2023, Corresponding Author & Project Lead)
Diffusion probabilistic model made slim (CVPR 2022)
Dataset Quantization (ICCV 2023)
Scaling & Shifting Your Features (NeurIPS 2022 Spotlight, Equal First Author)
Research Experience
Led model pre-training and diffusion algorithm design for HunyuanVideo
Key contributor to video generation projects including StoryDiffusion, Magic-Me, MagicVideo series
Pioneered research on parameter-efficient fine-tuning (e.g., DiffFit), small-data expansion, and diffusion model slimming
Developed PLLaVA: a parameter-free extension of LLaVA from images to videos for dense video captioning
Proposed Dataset Quantization pipeline achieving 5×–10× training speedup (ICCV 2023)
Background
Currently a Tenure-track Assistant Professor at Peking University
Focused on minimizing energy and memory consumption for training and deploying powerful AI algorithms
Applications include Robotics, AIGC, and Vision-Language-Action (VLA) systems
Research interests: explainable video representation design and efficient long video generation (both training and inference)
Strong interest in hardware-algorithm co-design, especially DNN architecture and memory co-optimization
Ongoing work on model and dataset efficiency for discriminative, generative, and multimodal models
Co-authors
5 total
Jiashi Feng
ByteDance Inc.
Qibin Hou
Nankai University
Xiaojie Jin, 靳潇杰
Bytedance Research, USA
Shuicheng Yan, Fellow of AAAI, ACM, SAEng, IEEE, IAPR
National University of Singapore
Yunpeng Chen
National University of Singapore