Yuwei Fang

Google Scholar ID: Om_-hHsAAAAJ

Principal AI Research Scientist, Zoom

Deep Learning, NLP, Multimodal

Citations & Impact

All-time citations: 1,968

Resume (English only)

Academic Achievements

Publications:
- VIMI: Grounding Video Generation through Multi-modal Instruction (EMNLP 2024)
- Evaluating Very Long-Term Conversational Memory of LLM Agents (ACL 2024)
- Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis (CVPR 2024)
- Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers (CVPR 2024)
- Unifying Vision, Text, and Layout for Universal Document Processing (CVPR 2023)
- i-Code Studio: A Configurable and Composable Framework for Integrative AI (EMNLP 2024 System Demonstrations)
- i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data (NAACL 2024)
- i-Code: An Integrative and Composable Multimodal Learning Framework (AAAI 2023)
- MACSum: Controllable Summarization with Mixed Attributes (TACL 2023)

Research Experience

Principal Research Scientist at Zoom AI. Previously worked at Snap Research and Microsoft Azure AI.

Background

Research interests are in multimodal generation and NLP, with a particular focus on building a unified system that can ground and reason over diverse external world knowledge to enable multilingual human-machine communication.

Miscellany