Yupan Huang

Google Scholar ID: ZbCCBogAAAAJ
Microsoft Research
Multimodal AI, General Artificial Intelligence, Computer Vision, Natural Language Processing
Citations & Impact
All-time
Citations: 1,691
H-index: 11
i10-index: 11
Publications: 14
Co-authors: 6 (list available)
Resume (English only)
Academic Achievements
  • "RedStone: Curating General, Code, Math, and QA Data for Large Language Models", arXiv preprint, 2024
  • "Kosmos-2.5: A Multimodal Literate Model", arXiv preprint, 2023 (Equal Contribution)
  • "Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models", ICLR Workshop, 2024
  • "TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering", ECCV 2024 (Oral); Top 10 in Hugging Face Space Trending List (Dec 2023)
  • "TextDiffuser: Diffusion Models as Text Painters", NeurIPS 2023 (Equal Contribution); Top 10 in Hugging Face Space Trending List (Jun 2023)
  • "LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking", ACM Multimedia 2022 (Oral); Over 100 million downloads on Hugging Face (as of Feb 2024)
Research Experience
  • Jun. 2024 – present: Senior Researcher, General Artificial Intelligence Group, Microsoft Research Asia – Vancouver; Topics: Multimodal AI, General AI, and Large Foundation Models
  • Jan. 2023 – Jun. 2023: Visiting Student, Language Technology Lab, University of Cambridge; Advisor: Prof. Nigel Collier; Topic: multimodal instruction-following models
  • Jul. 2021 – Jun. 2024: Research Intern, Natural Language Computing (now GenAI) Group, Microsoft Research Asia – Beijing; Mentors: Dr. Lei Cui and Dr. Furu Wei; Topics: multimodal document foundation models; visual text rendering with diffusion models
  • Jun. 2019 – Jul. 2021: Research Intern, Multimedia Search and Mining Group, Microsoft Research Asia – Beijing; Mentors: Dr. Bei Liu and Dr. Jianlong Fu; Topics: vision-language pre-training; image-and-text generation
  • Jul. 2017 – Jul. 2018: Research Intern, Multimedia Search and Mining Group, Microsoft Research Asia – Beijing; Mentors: Dr. Qi Dai and Dr. Tao Mei; Topic: video action detection