Jianwei Yang

Google Scholar ID: Cl9byD8AAAAJ

Research Scientist, Meta SuperIntelligence Lab

Multimodal Agentic AI

Citations & Impact

All-time

Citations

25,629

Publications

21 items

Resume (English only)

Academic Achievements

Pioneered a series of multimodal vision foundation models: UniCL, RegionCLIP, GLIP, Florence.
Developed generalist multimodal models: X-Decoder, SEEM, Semantic-SAM.
Contributed to large multimodal models: LLaVA variants, SoM Prompting for GPT-4V, Phi-3-Vision (4.2B parameters).
Released Magma in Feb 2025; it ranked #1 on Hacker News.
Released OmniParser, a pure vision-based UI parser, in Nov 2024 (5.5k GitHub stars).
BiomedParse was accepted by Nature Methods and GigaPath by Nature in Sep 2024.
Organized the 3rd CVinW Workshop at CVPR 2024.
Served as Area Chair for NeurIPS 2024 and ICLR 2025.
Delivered invited talks and tutorials at NeurIPS, CVPR, OpenAI Robotics, UW, and Together AI.

Co-authors

69 total