Yuexiang Zhai
Google Scholar ID: 78WTKm4AAAAJ
UC Berkeley | Google DeepMind
Reinforcement Learning · Data Labeling · Evaluation
Citations & Impact (all-time)
  • Citations: 2,400
  • H-index: 18
  • i10-index: 19
  • Publications: 20
  • Co-authors: 53
Resume (English only)
Academic Achievements
  • Co-first author of 'SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training' at ICML 2025.
  • First author of 'Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning' at NeurIPS 2024.
  • Co-author of 'Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs', presented orally at CVPR 2024.
  • First author of 'Investigating the Catastrophic Forgetting in Multimodal Large Language Model Fine-Tuning' at CPAL 2024.
  • Co-first author of 'Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning' at NeurIPS 2023.
  • Co-first author of 'Understanding the Complexity Gains of Single-Task RL with a Curriculum' at ICML 2023.
  • First author of 'Computational Benefits of Intermediate Rewards for Goal-Reaching Policy Learning' published in JAIR 2022.
  • Contributed to the Gemini 2.5 project on advanced reasoning, multimodality, long context, and next-generation agentic capabilities.
Background
  • Currently a Member of Technical Staff at xAI.
  • Research interests span machine learning, reinforcement learning, and large models.
  • Curious about things not yet understood, such as how the universe works and what his cats are thinking.
  • Recently developed an interest in critiquing 'fake papers': work that appears fancy but lacks real substance.
  • Currently focused on near-future (≤5 years) practical directions like data cleaning, evaluation, post-training, and agents, especially in large multimodal models.