Ganqu Cui

Google Scholar ID: 3IVSzZgAAAAJ

Shanghai AI Lab

LLM Alignment, Reinforcement Learning

Citations & Impact

Citations (all-time): 12,277

Publications

34 items

Draft-OPD: On-Policy Distillation for Speculative Draft Models

2026

Post-Trained MoE Can Skip Half Experts via Self-Distillation

2026

Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling

2026

Teaching Thinking Models to Reason with Tools: A Full-Pipeline Recipe for Tool-Integrated Reasoning

2026

TEMPO: Scaling Test-time Training for Large Reasoning Models

2026

InCoder-32B: Code Foundation Model for Industrial Scenarios

2026

How Far Can Unsupervised RLVR Scale LLM Training?

2026

Think Longer to Explore Deeper: Learn to Explore In-Context via Length-Incentivized Reinforcement Learning

2026

Resume

Academic Achievements

Publications: Multiple papers accepted at top conferences such as NeurIPS, EMNLP, and ICML.

Awards: NSFC Fund (August 2025); WAIC Yunfan Rising Star Award (July 2025); Tsinghua Outstanding Doctoral Dissertation Award (July 2024).

Projects: PRIME, a scalable reinforcement learning method. The Eurus-2-7B-PRIME model outperformed GPT-4o on advanced math benchmarks.

Research Experience

Research Scientist, Shanghai AI Laboratory (since July 2024)

Member, THUNLP Lab, Tsinghua University (until 2025)

Education

Ph.D.: Department of Computer Science and Technology, Tsinghua University; advisor: Prof. Zhiyuan Liu (graduated in 2025)

B.S.: Mathematics and Physics, Tsinghua University (graduated in 2019)

Background

Research Interests: LLM alignment and reinforcement learning. Previously worked on representation learning on graphs, especially graph neural networks and their applications.

Miscellany