Yunze Man

Google Scholar ID: xvQIEKAAAAAJ
University of Illinois Urbana-Champaign
Robotics · Machine Learning · Computer Vision · Autonomous Driving
Citations & Impact (All-time)
  • Citations: 963
  • H-index: 13
  • i10-index: 15
  • Publications: 20
  • Co-authors: 0
Academic Achievements
  • Publications:
    - GR00T N1.5: An Improved Open Foundation Model for Generalist Humanoid Robots (CVPR 2025)
    - Argus: Vision-Centric Reasoning with Grounded Chain-of-Thought (CVPR 2025)
    - RandAR: Decoder-only Autoregressive Visual Generation in Random Orders (CVPR 2025)
    - Floating No More: Object-Ground Reconstruction from a Single Image (CVPR 2025)
    - Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Reasoning (NeurIPS 2024)
    - SceneCraft: Layout-Guided 3D Scene Generation (NeurIPS 2024)
    - LLM4Vision: Frozen Transformers from Language Models are Effective Visual Encoder Layers (ICLR 2024, Spotlight)
    - SituationVLM: Situational Awareness Matters in 3D Vision Language Reasoning (CVPR 2024)
    - DualCross: Cross-Modality Cross-Domain Adaptation for Monocular BEV Perception (IROS 2023)
    - BEV-Guided Multi-Modality Fusion for Driving Perception (CVPR 2023)
  • Awards:
    - NVIDIA Graduate Fellowship, 2025
    - NeurIPS 2024 Top Reviewer
    - CVPR 2024 Outstanding Reviewer
Research Experience
  • [2024.05 ~ 2024.08] NVIDIA Research, Research Intern, hosted by Zhiding Yu, De-An Huang, and Guilin Liu
  • [2022.05 ~ 2023.01] Adobe Research, Research Intern, hosted by Jianming Zhang
Education
  • Ph.D., Computer Science, University of Illinois Urbana-Champaign; Advisors: Yu-Xiong Wang, Liang-Yan Gui
  • M.S., Robotics, Carnegie Mellon University; Advisor: Kris Kitani
  • B.S., Computer Science, Zhejiang University
Background
  • Research Interests: Intersection of vision, machine learning, and robotics. Focused on developing vision-centric reasoning models for multimodal and embodied AI agents, object-centric perception systems in dynamic scenes, vision foundation models for open-world scene understanding and generation, and large multimodal models for embodied reasoning and robotics planning.