Baoxiang Wang

Google Scholar ID: cQe4OeYAAAAJ

Assistant Professor, The Chinese University of Hong Kong, Shenzhen

reinforcement learning

Citations & Impact

All-time

Citations: 684

H-index: 15

i10-index: 19

Publications: 20

Co-authors: 10

Publications

21 items

PokerSkill: LLMs Can Play Expert-Level Poker without Training or Solvers (2026). Cited by: 0

The Reciprocity Gradient (2026). Cited by: 0

Epistemic Gain, Aleatoric Cost: Uncertainty Decomposition in Multi-Agent Debate for Math Reasoning (2026). Cited by: 0

Talk, Judge, Cooperate: Gossip-Driven Indirect Reciprocity in Self-Interested LLM Agents (2026). Cited by: 0

The Optimal Token Baseline: Variance Reduction for Long-Horizon LLM-RL (2026). Cited by: 0

Trust Region Masking for Long-Horizon LLM Reinforcement Learning (2025). Cited by: 0

Taming the Tail: Stable LLM Reinforcement Learning via Dynamic Vocabulary Pruning (2025). Cited by: 0

Policy-Conditioned Policies for Multi-Agent Task Solving (2025). Cited by: 0

Co-authors

10 total

Shuai Li (李帅)

Shanghai Jiao Tong University

The Chinese University of Hong Kong, Shenzhen

The Chinese University of Hong Kong, Shenzhen

Zhejiang University

Zhejiang Lab and UCAS and Zhejiang University

Professor of Computer Science, Zhejiang University

Jun Xiao (肖俊)

Institute of Artificial Intelligence, Zhejiang University