Yuwei Fang

Google Scholar ID: Om_-hHsAAAAJ
Principal AI Research Scientist, Zoom
Deep Learning, NLP, Multimodal
Citations & Impact
All-time
Citations: 1,968
H-index: 21
i10-index: 28
Publications: 20
Co-authors: 15 (list available)
Resume (English only)
Academic Achievements
  • Publications:
    - VIMI: Grounding Video Generation through Multi-modal Instruction (EMNLP 2024)
    - Evaluating Very Long-Term Conversational Memory of LLM Agents (ACL 2024)
    - Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis (CVPR 2024)
    - Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers (CVPR 2024)
    - Unifying Vision, Text, and Layout for Universal Document Processing (CVPR 2023)
    - i-Code Studio: A Configurable and Composable Framework for Integrative AI (EMNLP 2024, System Demonstrations)
    - i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data (NAACL 2024)
    - i-Code: An Integrative and Composable Multimodal Learning Framework (AAAI 2023)
    - MACSum: Controllable Summarization with Mixed Attributes (TACL 2023)
Research Experience
  • Principal Research Scientist at Zoom AI. Previously worked at Snap Research and Microsoft Azure AI.
Background
  • Research interests are in multimodal generation and NLP. Particularly interested in building a unified system that can ground its outputs in, and reason over, diverse external world knowledge to enable multilingual human-machine communication.
Miscellany
  • Contact and social media links:
    - Email: studyfang AT gmail.com
    - LinkedIn
    - GitHub
    - Twitter