Proposed the 'Small Model Learnability Gap': small models learn better from shorter, simpler reasoning chains than from long CoT traces or distillation from large teachers
Discovered that RL-trained math models generalize well to non-reasoning domains (e.g., alignment), while SFT-trained models lose this capacity; identified the sampling policy as the key factor behind this generalization
Identified 'Temporal Forgetting': during RL training of Deepseek-R1-1.5B, 76.7% of AIME problems were solved correctly at some intermediate checkpoint, but only 30% remained correct in the final model; proposed 'Temporal Sampling', which draws answers from multiple checkpoints to exploit this training-dynamics diversity
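The core idea of checkpoint-based sampling can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the checkpoint policies are hypothetical callables standing in for model inference, and majority voting is one plausible way to aggregate the pooled answers.

```python
from collections import Counter

def temporal_sample(checkpoints, prompt, per_checkpoint=4):
    """Sketch of Temporal Sampling: draw answers from several training
    checkpoints instead of only the final model, then aggregate.
    `checkpoints` is a list of callables mapping a prompt to an answer
    string (stand-ins for sampling from saved model weights)."""
    answers = []
    for generate in checkpoints:
        for _ in range(per_checkpoint):
            answers.append(generate(prompt))
    # Majority vote over the pooled answers from all checkpoints
    return Counter(answers).most_common(1)[0][0]

# Toy stand-ins: two intermediate checkpoints that solve the problem,
# and a final checkpoint that has "temporally forgotten" it.
mid_ckpt = lambda p: "42"
final_ckpt = lambda p: "41"
print(temporal_sample([mid_ckpt, mid_ckpt, final_ckpt], "AIME #7"))  # -> 42
```

Because a problem forgotten by the final model is often still solved at earlier checkpoints, pooling across checkpoints recovers answers that sampling only the final model would miss.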
Introduced SafeChain dataset to improve safety alignment without compromising reasoning; showed that long CoT does not necessarily enhance safety
TinyV: proposed a lightweight LLM-based verifier to address a >38% false-negative rate in answer verification during RL training, improving reward estimation accuracy
Visual Sphinx: developed a four-stage pipeline that generates 660K visual logic puzzles for RL training of multimodal reasoning models