Zhengyang Tang

Google Scholar ID: 2RRV0PQAAAAJ

CUHKSZ

Large Language Models, Mathematical Reasoning, Information Retrieval

Citations & Impact

Citations (all-time): 3,963

Contact

Email: zhengyangtang@link.cuhk.edu.cn

Publications

15 items

PhoneWorld: Scaling Phone-Use Agent Environments

2026

The Missing Piece in Pre-trained Model Evaluation: Reward-Guided Decoding Unlocks Task-Oriented Behavior Without Parameter Updates

2026

Safe, or Simply Incapable? Rethinking Safety Evaluation for Phone-Use Agents

2026

Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-World Workflows

2026

AlphaInventory: Evolving White-Box Inventory Policies via Large Language Models with Deployment Guarantees

2026

Cut Your Losses! Learning to Prune Paths Early for Efficient Parallel Reasoning

2026

Do Phone-Use Agents Respect Your Privacy?

2026

Teaching Language Models to Reason with Tools

2025

Resume

Academic Achievements

NeurIPS 2025: CoRT (Code-integrated Reasoning within Thinking)
COLM 2025: SCRIT (Self-Evolving Critique Abilities in LLMs)
ICML 2024: MathScale (Scaling Instruction Tuning for Mathematical Reasoning)
TMLR 2025: GLAN (Generalized Instruction Tuning for Language Models)
ACL 2025: Second Language (Arabic) Acquisition of LLMs via Progressive Vocabulary Expansion (Oral & Panel)
COLING 2022: DPTDR (Deep Prompt Tuning for Dense Passage Retrieval)
Operations Research 2025: ORLM (Customizable Framework for Automated Optimization Modeling)
Contributed to the Qwen3 Technical Report (tool-integrated reasoning)

Background

Ph.D. candidate at The Chinese University of Hong Kong, Shenzhen, advised by Prof. Benyou Wang.
Research focuses on developing intelligent agents capable of complex reasoning and self-improvement.
Pioneers agentic frameworks leveraging reinforcement learning (RL) for tool-integrated tasks.
Proposed SCRIT, a self-evolving critique model serving as a generative reward model for scalable, supervision-free oversight.
Designed instruction tuning frameworks (MathScale, GLAN, and ALAN) for generating high-quality training data at scale.
Also works on efficient information access methods (e.g., DPTDR), achieving top performance on competitive benchmarks.

Co-authors

9 total