Publications
Published several papers, including 'Plato: Plan to Efficiently Decode for Large Language Model Inference' (COLM 2025) and 'Compute Or Load KV Cache? Why Not Both?' (ICML 2025), and has contributed to projects such as Cake, Plato, HeterMoE, and Eagle.
Research Experience
Current research focuses on improving the efficiency of large language model (LLM) inference through the co-design of algorithms and system architectures. Broader research areas include machine learning systems and network systems.
Education
PhD in Computer Science and Engineering from the University of Michigan, 2020-2025, supervised by Prof. Z. Morley Mao; BEng in Computer Science from the School of the Gifted Young, University of Science and Technology of China, 2016-2020.
Background
Currently an Applied Scientist at Amazon, working on algorithms and systems for LLM post-training. Research interests lie at the intersection of machine learning systems and network systems, with a focus on making LLM inference more efficient through algorithm-system co-design.
Miscellany
Loves the movie 'Everything Everywhere All at Once' and believes that art is long-lasting while life is short.