Cong Wei
Google Scholar ID: y1d5C5YAAAAJ
University of Waterloo
Research topics: Reasoning · Diffusion · Efficiency
Citations & Impact (all-time)
  • Citations: 2,044
  • h-index: 10
  • i10-index: 11
  • Publications: 14
  • Co-authors: 10
Academic Achievements
  • MoCha: Towards Movie-Grade Talking Character Synthesis, NeurIPS 2025 (Spotlight Presentation)
  • UniVideo: Unified Understanding, Generation, and Editing for Videos, arXiv 2025
  • OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision, ICLR 2025
  • Sparsifiner: Learning Sparse Instance-Dependent Attention for Efficient Vision Transformers, CVPR 2023
  • UniIR: Training and Benchmarking Universal Multimodal Information Retrievers, ECCV 2024 (Oral Presentation)
  • AnyV2V: A Tuning-Free Framework for Any Video-to-Video Editing Tasks, TMLR 2024 (TMLR Reproducibility Certification)
  • MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI, CVPR 2024 (Oral Presentation, Best Paper Finalist)
Research Experience
  • Kuaishou Technology, KlingAI, May 2025 - Present, Research Scientist Intern
  • Meta GenAI, US, Oct 2024 - Apr 2025, Research Scientist Intern
  • ModiFace, Canada, May 2022 - Nov 2022, Machine Learning Researcher Intern
  • Vector Institute, Canada, Sep 2020 - Sep 2021, Undergraduate Researcher
Education
  • University of Waterloo, Canada
    - PhD in Computer Science, May 2023 - Present, Advisor: Wenhu Chen
  • University of Toronto, Canada
    - Master of Science in Applied Computing, Sep 2021 - Jun 2023, Advisor: Florian Shkurti
    - Honours Bachelor of Science, Sep 2017 - May 2021, Majors: Computer Science, Statistics, Minor: Mathematics, Advisor: David Duvenaud
  • Vector Institute, Canada
    - Undergraduate Researcher, Sep 2020 - Sep 2021, Advisors: David Duvenaud and Gennady Pekhimenko
Background
  • Research Interests: Video generation and multi-modal models
  • Field: Computer Science
  • Brief Introduction: Building unified models to scale up data usage; previously conducted research on sparse attention.