Quanquan Gu
Google Scholar ID: GU9HgNAAAAAJ
Associate Professor of Computer Science, UCLA
Research interests: AGI · Large Language Models · Reinforcement Learning · Nonconvex Optimization
Homepage
Google Scholar
Citations & Impact (All-time)
Citations: 21,045
H-index: 69
i10-index: 221
Publications: 20
Co-authors: 77
Contact
CV
Twitter
GitHub
Publications
41 items
Transformers Trained via Gradient Descent Can Provably Learn a Class of Teacher Models
2026 · Cited: 0

Dimension-Independent Convergence of Underdamped Langevin Monte Carlo in KL Divergence
2026 · Cited: 0

Near-Optimal Regret for KL-Regularized Multi-Armed Bandits
2026 · Cited: 0

Protein Autoregressive Modeling via Multiscale Structure Generation
2026 · Cited: 0

Scalable Spatio-Temporal SE(3) Diffusion for Long-Horizon Protein Dynamics
2026 · Cited: 0

Deep Delta Learning
arXiv.org · 2026 · Cited: 3

Group Representational Position Encoding
2025 · Cited: 0

On the Limits of Test-Time Compute: Sequential Reward Filtering for Better Inference
2025 · Cited: 0
Resume (English only)
Academic Achievements
Published numerous papers in top-tier conferences including NeurIPS, ICML, ICLR, and COLT, such as:
“Tensor Product Attention Is All You Need” (NeurIPS 2025, Spotlight)
“MARS: Unleashing the Power of Variance Reduction for Training Large Models” (ICML 2025)
“Self-Play Preference Optimization for Language Model Alignment” (ICLR 2025)
“Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation” (NeurIPS 2024)
“Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models” (ICML 2024)
“Implicit Bias of Gradient Descent for Two-layer ReLU and Leaky ReLU Networks on Nearly-orthogonal Data” (NeurIPS 2023)
“Why Does Sharpness-Aware Minimization Generalize Better Than SGD?” (NeurIPS 2023)
“Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes” (ICML 2023)
“Benign Overfitting for Two-layer ReLU Convolutional Neural Networks” (ICML 2023)
“Variance-Dependent Regret Bounds for Linear Bandits and Reinforcement Learning” (COLT 2023)
“Computationally Efficient Horizon-Free Reinforcement Learning for Linear Mixture MDPs” (NeurIPS 2022, Oral)
“Nearly Optimal Algorithms for Linear Contextual Bandits with Adversarial Corruptions” (NeurIPS 2022)
Organized the NeurIPS 2023 workshop “New Frontiers of AI for Drug Discovery and Development”
Co-authors (77 total)
Dongruo Zhou
Indiana University Bloomington
Difan Zou
The University of Hong Kong
Jiawei Han
Abel Bliss Professor of Computer Science, University of Illinois
Yuan Cao
The University of Hong Kong
Pan Xu
Duke University
Jinghui Chen
Assistant Professor of Information Sciences and Technology, Penn State University
Weitong Zhang
Assistant Professor, SDSS, UNC Chapel Hill
Huizhuo Yuan
Bytedance Seed