Yuhang Zang

Google Scholar ID: hW23VKIAAAAJ

Shanghai AI Laboratory

Natural Language Processing, Vision Language Model

Citations & Impact

All-time

Citations: 6,046
H-index: 26
i10-index: 42
Publications: 20
Co-authors: 76

Publications

59 items

Beyond the Current Observation: Evaluating Multimodal Large Language Models in Controllable Non-Markov Games (2026). Cited: 0

CapRL++: Unified Reinforcement Learning with Verifiable Rewards for Dense Image and Video Captioning (2026). Cited: 0

AdaGRPO: A Capability-Aware Adaptive Enhancement for Flow-based GRPO (2026). Cited: 0

OVO-S-Bench: A Hierarchical Benchmark for Streaming Spatial Intelligence in Multimodal LLMs (2026). Cited: 0

Pave-GRPO: Beyond Instantaneous Guidance through Principled Average Velocity Decomposition (2026). Cited: 0

Skill-as-Pseudocode: Refactoring Skill Libraries to Pseudocode for LLM Agents (2026). Cited: 0

ETCHR: Editing To Clarify and Harness Reasoning (2026). Cited: 0

SetCon: Towards Open-Ended Referring Segmentation via Set-Level Concept Prediction (2026). Cited: 0

Resume (English only)

Academic Achievements

Multiple papers accepted at top international conferences and journals, including NeurIPS 2025, ICCV 2025, Findings of ACL 2025, ICML 2025, CVPR 2025, ICLR 2025, NeurIPS 2024, ACM MM 2024, ECCV 2024, CVPR 2024, and IJCV. Notable works include UnifiedReward-Think, Hi-Flow, Visual-RFT, MM-IFEngine, X-Prompt, Bootstrap3D, Grounded CoT Highlight, Light-A-Video, MIR, SAM2Long, IXC-2.5-Reward, Light-ColPali, VideoRoPE, SongGen, ByTheWay, OVO-Bench, Dispider, PyramidDrop, WildAvatar, MIA-DPO, MotionClone, MMLongbench-Doc, ShareGPT4Video, MMDU, InternLM-XC2-4khd, VideoStreaming, MMStar, VLMEvalKit, Long-CLIP, MVSGaussian, Alpha-CLIP, CascadeMatch, and OV-DETR.

Research Experience

Joined Apple (AI/ML) as a research intern in June 2023.

Education

Received a bachelor's degree from UESTC in 2019 and a PhD from Nanyang Technological University in 2023, supervised by Prof. Chen Change Loy.

Background

Current research focuses on 1) post-training for multimodal LLMs (reinforcement fine-tuning, reward models), and 2) vision-language pre-training.

Co-authors

76 total

Microsoft GenAI

Shanghai AI Laboratory

The Chinese University of Hong Kong

Shanghai AI Lab | CUHK | PKU

MMLab The Chinese University of Hong Kong

Chen Change Loy

President's Chair Professor, MMLab@NTU, S-Lab, Nanyang Technological University

Shanghai AI Laboratory

Assistant Professor, Hong Kong Baptist University