Huazheng Wang

Google Scholar ID: w3PrbKwAAAAJ

Assistant Professor, Oregon State University

Reinforcement Learning, Machine Learning, Information Retrieval

Citations & Impact

All-time citations: 1,518



Publications

12 items

Who&When Pro: Can LLMs Really Attribute Failures in AI Agents?

2026

Online KL-Regularized Reinforcement Learning with Function Approximation under Misspecification

2026

EvoPool: Evolutionary Programmatic Annotation for Label-Efficient Specialized Supervision

2026

Speculative Pipeline Decoding: Higher-Accuracy and Zero-Bubble Speculation via Pipeline Parallelism

2026

When Does Multi-Agent RL Improve LLM Workflows? Workflow, Scale, and Policy-Sharing Tradeoffs

2026

MetaAgent-X: Breaking the Ceiling of Automatic Multi-Agent Systems via End-to-End Reinforcement Learning

2026

Towards Affordable Energy: A Gymnasium Environment for Electric Utility Demand-Response Programs

2026

EVOCHAMBER: Test-Time Co-evolution of Multi-Agent System at Individual, Team, and Population Scales

2026

Resume

Academic Achievements

Received EECS Fabulous Teacher Recognition in June 2025.
Two papers accepted by ICML 2025: one spotlight paper on failure attribution of multi-agent LLMs and one on principal-agent bandits.
Talk at AAAI 2025 New Faculty Highlight: “Efficient and Robust Reinforcement Learning from Human Feedback”.
One paper on analyzing gradient entanglement of DPO and its variants is accepted by ICLR 2025.
Talk at CS colloquium series, University of Rochester: “Robust Reinforcement Learning from Biased Human Feedback and Corruption: Theory and Algorithms”.
One paper on risk-aware preference-based RL is accepted by NeurIPS 2024.
Received a new NSF award (IIS-2403401) on Neural Bandits in August 2024.
One paper on conversational dueling bandits is accepted by KDD 2024, and another paper on adversarial attacks on combinatorial bandits is accepted by ICML 2024.
One paper on federated pure exploration is accepted by UAI 2024.
One paper on policy alignment is accepted by ICLR 2024.
Two papers accepted by AAAI 2024: one on tree search bandits for protein optimization and one on stealthy attacks against multi-armed bandits (MAB).
One paper on offline RL for learning to rank is accepted by NeurIPS 2023.
One paper on representation learning in POMDP is accepted by ICML 2023.
One paper on asynchronous kernel bandits is accepted by ICLR 2023.
Two papers accepted by NeurIPS 2022: one on distributed kernel bandits and the other on Thompson Sampling for Directed Evolution.
Named an ICML 2021 Best Reviewer (Top 10%) in August 2021.
Received SIGIR 2019 Best Paper Award in August 2019.
Received the Bloomberg Data Science Ph.D. Fellowship, held from 2018 to 2021.

Background

Research interests include reinforcement learning, information retrieval, and machine learning in general. Recently focused on developing provably efficient and trustworthy reinforcement learning and multi-armed bandit algorithms with applications to recommendation, ranking, LLM agents, and scientific discovery problems in biology and chemistry.

Miscellany

Looking for one self-motivated PhD student with a solid math and coding background to start in Fall 2026.

Co-authors

4 total

Qingyun Wu

The Pennsylvania State University

Hongning Wang

Associate Professor, Department of Computer Science and Technology, Tsinghua University

Mengdi Wang

Professor, Princeton AI Lab, CSML&ECE, Princeton University

Quanquan Gu

Associate Professor of Computer Science, UCLA