Jiarui Yao

Google Scholar ID: 84fexSEAAAAJ

CS, UIUC

Reinforcement Learning, Machine Learning, Large Language Models

Citations & Impact

All-time citations: 130

Publications

16 items

Predictive Divergence Masks for LLM RL (2026)
GRAIN: Group Aggregation via Min-Norm Objective (2026)
Rethinking the Divergence Regularization in LLM RL (2026)
AgentSPEX: An Agent SPecification and EXecution Language (2026)
Supervised Fine-Tuning versus Reinforcement Learning: A Study of Post-Training Methods for Large Language Models (2026)
PRL: Process Reward Learning Improves LLMs' Reasoning Ability and Broadens the Reasoning Boundary (2026)
ERA: Transforming VLMs into Embodied Agents via Embodied Prior Learning and Online Reinforcement Learning (2025)
GAR: Generative Adversarial Reinforcement Learning for Formal Theorem Proving (2025)

Resume (English only)

Academic Achievements

Released GVM (Gradient Variance Minimization), a framework that improves data sampling efficiency in LLM math reasoning.
Wrote a report analyzing what makes GRPO 'stand out' for math reasoning, with analysis and ablation studies comparing different algorithms for LLM reasoning training.
Released FANS (Formal Answer Selection for Natural Language Math Reasoning Using Lean4), which improves test-time math answer selection using formal language.
Published the paper 'Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL'.
Published the paper 'A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce'.
Published the paper 'FANS – Formal Answer Selection for Natural Language Math Reasoning Using Lean4'.

Research Experience

Started a PhD in computer science at UIUC in August 2024.

Education

First-year CS PhD student in the Siebel School of Computing and Data Science, University of Illinois Urbana-Champaign (UIUC), advised by Prof. Tong Zhang. Bachelor of Engineering from Yao Class, Tsinghua University.

Background

Research interests center on reinforcement learning and large language models, especially autonomous agent learning, model reasoning, and interdisciplinary fields.

Miscellany