
Hangjie Yuan

Google Scholar ID: jQ3bFDMAAAAJ

Alibaba DAMO | ZJU | MMLab@NTU

Generative Models | Multimodal Models | Foundation Models | Video Understanding


Citations & Impact

Citations (all-time): 2,294

Contact

Email: hj.yuan@zju.edu.cn

Publications

38 items

- Towards Error-Free Long Video Generation (2026)
- ClinHallu: A Benchmark for Diagnosing Stage-Wise Hallucinations in Medical MLLM Reasoning (2026)
- GUI-AC: Enhancing Continual Learning in GUI Agents (2026)
- 5% > 100%: Flatness Preference is All You Need for Multimodal Parameter-Efficient Fine-Tuning (2026)
- Towards 3D-Aware Video Diffusion Models: Render-Free Human Motion Control with Mesh Tokenization (2026)
- Lumos-Nexus: Efficient Frequency Bridging with Homogeneous Latent Space for Video Unified Models (2026)
- Bridging Brain and Semantics: A Hierarchical Framework for Semantically Enhanced fMRI-to-Video Reconstruction (2026)
- A Faster Path to Continual Learning (2026)

Resume

Academic Achievements

- Published Papers: UniLumos and VideoMAR accepted to NeurIPS 2025; SAMora, DreamRelation, and FreeScale accepted to ICCV 2025; Dual-Arch and ZeroFlow accepted to ICML 2025; FreeMask and AeroGTO accepted to AAAI 2025; EvolveDirector and C-Flat accepted to NeurIPS 2024; PAPM accepted to ICML 2024; ArchCraft accepted to IJCAI 2024; InstructVideo, DreamVideo, and TF-T2V accepted to CVPR 2024; LUM-ViT accepted to ICLR 2024; VideoComposer accepted to NeurIPS 2023; RLIPv2 accepted to ICCV 2023; RLIP: Relational Language-Image Pre-training accepted to NeurIPS 2022 as a Spotlight paper; Elastic Response Distillation accepted to CVPR 2022; Object-guided Cross-modal Calibration Network accepted to AAAI 2022
- Awards: Special Grant for Postdoctoral Research (Top 10 in Zhejiang Province); Outstanding Research Intern Award from Alibaba (20 of 1,000+ candidates) for contributions to video generation; AAAI-22 Scholarship

Research Experience

- Alibaba DAMO Academy: Research Scientist, focusing on cutting-edge problems in foundation models
- Zhejiang University: Research position, working with Prof. Yi Yang
- Representative Projects: VGen (includes InstructVideo), VideoComposer, Lumos series (includes Lumos-1), RLIP series (v1 and v2), DreamVideo series (v1 and v2), and ModelScopeT2V

Education

- Ph.D. Degree: Graduated from Zhejiang University in the summer of 2024, supervised by Prof. Dong Ni, Prof. Samuel Albanie (University of Cambridge/DeepMind), Deli Zhao (Alibaba DAMO), and Shiwei Zhang (Alibaba Tongyi Wan Team)
- Visiting Ph.D. Program: Conducted visiting research at MMLab@NTU, supervised by Prof. Ziwei Liu and Dr. Chenyang Si

Background

- Research Interests: Generative models, representation learning, AI for science and engineering
- Professional Field: Foundation models, visual generation/editing, visual autoregressive models, visual generation alignment, vision-language models, video understanding, visual relation detection (HOI detection/scene graph generation)
- Introduction: A research scientist at Alibaba DAMO Academy who joined via the Alibaba Star program and focuses on cutting-edge problems in foundation models; also holds a research position at Zhejiang University, working with Prof. Yi Yang.

Miscellany

- Recruiting Interns: Recruiting full-time or remote interns to work on cutting-edge problems in foundation models
- Academic Service: Serving as an Area Chair for ICLR 2026

Co-authors

22 total