Paper "d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning" accepted at NeurIPS 2025 as a spotlight.
Paper "Do LLMs Recognize Your Preferences? Evaluating Personalized Preference Following in LLMs" accepted at ICLR 2025 as an oral presentation.
Paper "Probing the Decision Boundaries of In-context Learning in LLMs" accepted at NeurIPS 2024; it also won the Best Paper Award at the Foundation Model Interventions Workshop, NeurIPS 2024.
Received the 2024 Amazon Fellowship.
Research Experience
Before my PhD, I worked on 3D perception and reinforcement learning algorithms for autonomous driving agents.
Education
Bachelor’s degree in Engineering Science (Machine Intelligence program) from the University of Toronto. Currently a PhD student in Computer Science at UCLA, advised by Professor Aditya Grover.
Background
A fourth-year PhD student in Computer Science at UCLA, advised by Professor Aditya Grover. My primary research interest lies in endowing machines with human-like reasoning and efficiency. My recent research focuses on understanding and scaling (diffusion) LLM reasoning via RL, efficient preference alignment and personalization, and LLM inference efficiency and modular design for RL.