- September 2025: Two papers accepted by NeurIPS 2025.
- August 2025: One paper accepted by TMLR.
- May 2025: One paper accepted by TMLR.
- April 2025: DFloat11 (lossless compression for LLMs) was covered by 新智元 and 机器之心.
- April 2025: The Stop Overthinking survey was covered by 新智元.
- March 2025: Released our survey "Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models."
- February 2025: Three papers accepted by CVPR 2025; one of them, TopV, marks my first project as an advisor.
- September 2024: One paper accepted by NeurIPS 2024.
Research Experience
Spring 2024: Research Intern at Snap Research (Creative Vision team); proposed BitsFusion, a 1.99-bit quantization method for text-to-image generative models.
2022: Research Intern at Tencent America Media Lab, exploring the efficiency and robustness of learned image compression and Transformer models.
2019: Full-time Algorithm Engineer at JD, working on face verification and recognition.
2018: R&D Intern on the PaddlePaddle team at Baidu, where I helped initiate the Paddle-Lite deep learning inference framework.
Education
Ph.D.: Rutgers University, Advisor: Prof. Bo Yuan.
Background
Research Interests: Efficient AI and Trustworthy AI. In Efficient AI, I focus on developing resource-efficient deep learning models without compromising accuracy or performance. In Trustworthy AI, I investigate model vulnerability and robustness through adversarial and backdoor attacks.
Miscellany
Personal Interests: Basketball, DOTA/DOTA2, and World of Warcraft. Favorite players: Tracy McGrady, Stephen Curry, Lionel Messi, and PIS (YaphetS).