Jianwei Yang

Google Scholar ID: Cl9byD8AAAAJ
Research Scientist, Meta SuperIntelligence Lab
Multimodal Agentic AI
Citations & Impact (All-time)
  • Citations: 25,629
  • H-index: 51
  • i10-index: 72
  • Publications: 20
  • Co-authors: 69
Resume
Academic Achievements
  • Pioneered a series of multimodal vision foundation models: UniCL, RegionCLIP, GLIP, Florence.
  • Developed generalist multimodal models: X-Decoder, SEEM, Semantic-SAM.
  • Contributed to large multimodal models: LLaVA variants, SoM Prompting for GPT-4V, Phi-3-Vision (4.2B parameters).
  • Released Magma in Feb 2025, ranked #1 on Hacker News.
  • Released OmniParser in Nov 2024 (5.5k GitHub stars), a pure vision-based UI parser.
  • BiomedParse accepted by Nature Methods and GigaPath by Nature in Sep 2024.
  • Organized the 3rd CVinW Workshop at CVPR 2024.
  • Served as Area Chair for NeurIPS 2024 and ICLR 2025.
  • Delivered invited talks and tutorials at NeurIPS, CVPR, OpenAI Robotics, UW, and Together AI.