Published multiple high-impact papers at top venues such as NeurIPS, MLSys, and TMLR, including:
— "EXP-Bench: Can AI Conduct AI Research Experiments?" (arXiv 2025, equal contribution)
— "The ML.ENERGY Benchmark: Toward Automated Inference Energy Measurement and Optimization" (NeurIPS 2025 Spotlight)
— "Curie: Toward Rigorous and Automated Scientific Experimentation with AI Agents" (arXiv 2025, equal contribution)
— "Andes: Defining and Enhancing Quality-of-Experience in LLM-Based Text Streaming Services" (arXiv 2024)
— "IaC-Eval: A Code Generation Benchmark for Infrastructure-as-Code Programs" (NeurIPS 2024)
— "Venn: Resource Management for Collaborative Learning Jobs" (MLSys 2025)
— "Efficient Large Language Models: A Survey" (TMLR 2024)
— "FedTrans: Efficient Federated Learning via Multi-Model Transformation" (MLSys)
Background
Optimist and strong advocate of AGI
Research interests include AI for Science (AI4S), MLSys, and AI Agents
Focuses on building efficient systems to push the boundaries of machine learning, including LLM serving systems, LLM training systems (e.g., Meta Llama Training Systems), and Agentic RL
Maintains comprehensive paper collections on Private ML Systems and LLM Systems
Open to meaningful research discussions and academic collaborations; reachable by email or by scheduling a meeting