Publications
Published several papers, including 'Learning Adaptive Parallel Reasoning with Language Models' (COLM 2025), which introduced the APR framework to improve the reasoning capabilities of language models; other works include SparseLoRA (ICML 2025), Q-Diffusion (ICCV 2023), and SVDQuant (ICLR 2025).
Research Experience
Has contributed to multiple research projects spanning efficient generative models (quantization and sparsity), long-context LLMs/VLMs, and machine learning systems.
Education
Currently a Ph.D. candidate at UC Berkeley, affiliated with Berkeley AI Research (BAIR) and advised by Prof. Kurt Keutzer. Received a B.A. in Computer Science and Mathematics from Cornell University, where he worked with Profs. Zhiru Zhang, Vitaly Shmatikov, and Song Han.
Background
Research interests include enhancing the reasoning capabilities of large language models and developing scalable AI agents. He also has broad expertise in making generative models more efficient, in both training and inference, across language and vision.
Miscellany
Actively seeking full-time member of technical staff and AI researcher/engineer positions in industry.