- AlphaQuanter: An End-to-End Tool-Orchestrated Agentic Reinforcement Learning Framework for Stock Trading (2025)
- Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space (2025)
- Think-RM: Enabling Long-Horizon Reasoning in Generative Reward Models (2025)
- SpineBench: Benchmarking Multimodal LLMs for Spinal Pathology Analysis (2025)
- APEX: Empowering LLMs with Physics-Based Task Planning for Real-time Insight (2025)
- ClinicalLab: Aligning Agents for Multi-Departmental Clinical Diagnostics in the Real World (2024)
- CodeHalu: Investigating Code Hallucinations in LLMs via Execution-based Verification (2024)
- Decompose and Compare Consistency: Measuring VLMs' Answer Reliability via Task-Decomposition Consistency Comparison (2024)
- CodeJudge-Eval: Can Large Language Models be Good Judges in Code Understanding? (2024)
- CodeScope: An Execution-based Multilingual Multitask Multidimensional Benchmark for Evaluating LLMs on Code Understanding and Generation (2023)
Research Experience
- 2024.12 - Present: Applied Scientist II, Amazon, Generative Foundation Modeling Team
- 2022.07 - 2023.01: Alibaba Qwen Team, advised by Dr. Wen Wang and Dr. Qian Chen
- 2021.06 - 2021.10: Microsoft Research Asia (MSRA), advised by Dr. Yuanchun Li
Education
- Master of Science in Computer Science, University of California, Santa Barbara (2023.09 - 2024.12)
- Dual Bachelor's Degrees in Engineering and Management, Beijing University of Posts and Telecommunications (2018.09 - 2022.06)
Background
Research interests: code intelligence, agentic reinforcement learning, and software automation. Currently an Applied Scientist on Amazon's Generative Foundation Modeling team, focusing on hallucination and code reasoning.
Miscellany
Member of BigCode🤗, an open-source community working on large language models for code.