Scholar
Zhiding Yu
Google Scholar ID: 1VI_oYUAAAAJ
Principal Research Scientist & Research Lead, NVIDIA Research
Computer Vision
Deep Learning
Citations & Impact
All-time
Citations
24,839
H-index
56
i10-index
89
Publications
20
Co-authors
15
list available
Publications
31 items
ProRL Agent: Rollout-as-a-Service for RL Training of Multi-Turn LLM Agents
2026
Cited
0
Towards Multimodal Lifelong Understanding: A Dataset and Agentic Baseline
2026
Cited
0
Stateful Token Reduction for Long-Video Hybrid VLMs
2026
Cited
0
PhyCritic: Multimodal Critic Models for Physical AI
2026
Cited
0
Nemotron ColEmbed V2: Top-Performing Late Interaction embedding models for Visual Document Retrieval
2026
Cited
0
OpenVision 3: A Family of Unified Visual Encoder for Both Understanding and Generation
2026
Cited
0
Fast-ThinkAct: Efficient Vision-Language-Action Reasoning via Verbalizable Latent Planning
2026
Cited
3
LocateAnything3D: Vision-Language 3D Detection with Chain-of-Sight
2025
Cited
0
Resume (English only)
Academic Achievements
Winner, CVPR24 Challenge on End-to-End Driving at Scale (Hydra-MDP).
2nd Place, CVPR24 Challenge on Driving with Language.
Winner, CVPR23 Challenge on 3D Occupancy Prediction (FB-BEV/FB-OCC).
Winner, ECCV22 Robust Vision Challenge (RVC) on Semantic Segmentation.
Winner, CVPR18 Autonomous Driving Challenge (WAD) on Domain Adaptation.
2nd Place, ICMI15 EmotiW Challenge on Static Facial Expression Recognition.
Best Paper Award, BMVC 2020.
Best Paper Award, WACV 2015.
Best Student Paper Award, ISCSLP 2014.
Most Influential NeurIPS Paper Award (SegFormer).
Numerous publications listed on Google Scholar.
Background
Principal Research Scientist & Research Lead at the Learning & Perception Research Group, NVIDIA Research.
Interested in building general autonomy and intelligence across virtual and physical domains.
Recent focus includes Vision Transformers, LLMs, multimodal LLMs, and vision-language-action (VLA) models.
Applications span open-world understanding, reasoning, AV/robot perception-planning, and agentic systems.
His work is characterized by state-of-the-art performance, scalable architectures, and data-centric strategies for real-world generalization.
Co-authors
15 total
Anima Anandkumar
California Institute of Technology and NVIDIA
Jose M. Alvarez
NVIDIA
Jan Kautz
Vice President of Research, NVIDIA Research
Weiyang Liu
CUHK | Max Planck Institute for Intelligent Systems
Shiyi Lan
NVIDIA
Co-author 6
Xiaodong Yang
NVIDIA Research
Yang Zou
Senior Applied Scientist, Amazon