Publications
1. Paper “Prompt Curriculum Learning for Efficient LLM Post-Training” on arXiv
2. Paper “Pre-trained Large Language Models Learn Hidden Markov Models In-context” at NeurIPS 2025
3. Paper “Accelerating RL for LLM Reasoning with Optimal Advantage Regression” at NeurIPS 2025
4. Paper “Value-Guided Search for Efficient Chain-of-Thought Reasoning” at NeurIPS 2025
5. Paper “Q#: Provably Optimal Distributional RL for LLM Post-Training” at NeurIPS 2025
6. Paper “Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF” at ICLR 2025
7. Paper “End-to-end Training for Recommendation with Language-based User Profiles” at CIKM 2025
8. Paper “REBEL: Reinforcement Learning via Regressing Relative Rewards” at NeurIPS 2024
9. Paper “Session-based Recommendation With Transformers” at RecSys Challenge 2022
10. Paper “Mitigating the Filter Bubble while Maintaining Relevance: Targeted Diversification with VAE-based Recommender Systems” at SIGIR 2022
11. Paper “MCL: Mixed-Centric Loss for Collaborative Filtering” at WWW 2022
12. Paper “Shoestring: Graph-Based Semi-Supervised Classification With Severely Limited Labeled Data” at CVPR 2020
Research Experience
1. Ph.D. student at Cornell University
2. Part-time Researcher at Meta Superintelligence
3. Paper accepted (poster and oral) to New York Reinforcement Learning Workshop 2025
Education
1. Cornell University, Ph.D. student in Computer Science, Advisors: Thorsten Joachims and Wen Sun
2. University of Toronto, Bachelor's degree in Computer Engineering, Advisors: Baochun Li, Scott Sanner, Maksims Volkovs
Background
Research interests include reinforcement learning, natural language processing, and recommender systems. Published multiple papers at top-tier conferences.
Miscellany
Part-time content creator with over 50,000 followers and 10 million views on Bilibili, Douyin, and YouTube