1. Two works on how negative-reward-only RL can improve LLM reasoning have been accepted to NeurIPS 2025.
2. AdaDecode has been accepted at ICML 2025.
3. One paper on evaluating LMMs' reasoning via chart-to-code generation has been accepted at ICLR 2025.
4. One paper on improving MoE models' reasoning via self-contrast has been accepted at NeurIPS 2024.
5. Received NeurIPS Scholar Award (2024), Computer Science Scholar Fellowship (2024), Outstanding Master’s Thesis (2024), and Outstanding Graduate (2024).
Research Experience
1. Apple AIML (Foundation Model Team), 2025.05 -- 2025.09, Mentors: Yun Zhu & Yihao Feng
2. Microsoft Research Asia, 2022.12 -- 2023.05, Mentors: Bei Chen & Jian-Guang LOU
3. International Digital Economy Academy, 2022.01 -- 2022.12, Mentors: Ruyi Gan & Jiaxing Zhang
Education
1. University of Virginia, Ph.D. in Computer Science, 2024 -- Present, Advisor: Prof. Yu Meng
2. Tsinghua University, Master's in Electronic and Information Engineering, 2021 -- 2024, Advisor: Prof. Yujiu Yang
3. Xidian University, Bachelor's in Electronic Science and Technology, 2017 -- 2021
Background
Research Interests: Large Language Models (LLMs) and machine learning, with a focus on improving LLMs via reinforcement learning. My long-term research goal is to enable human-expert-level reasoning, decision-making, and cognitive intelligence in neural models and systems.