Resume
Academic Achievements
ICML 2024: 'Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws' – extends Chinchilla scaling laws to account for expected inference demand when choosing model size and training data (see the compute-objective sketch after this list).
NeurIPS 2021: 'Scalable Neural Data Server: A Data Recommender for Transfer Learning' – presents a scalable system for recommending pre-training data across domains (e.g., medical, satellite).
IROS 2020 (Best Application Paper Finalist): 'Pit30M: A Benchmark for Global Localization in the Age of Self-Driving Cars' – introduces a large-scale LiDAR-based localization dataset.
NeurIPS 2024 Workshop: 'Sparse Upcycling: Inference Inefficient Finetuning' – analyzes trade-offs between model quality and inference efficiency in LLM upcycling.
NeurIPS 2022 Workshop: Investigated the scalability and limitations of influence estimation in deep learning, showing that many trained models are needed to obtain reliable estimates.
NeurIPS Meta-Learning Workshop 2021: Studied the interaction between BatchNorm’s implicit learning-rate decay and meta-learning adaptation (e.g., MAML); see the effective-step-size sketch after this list.
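
For context on the ICML 2024 entry above, here is a minimal sketch of an inference-aware scaling objective, using standard scaling-law conventions rather than the paper's exact notation: N is the parameter count, D_tr the training tokens, D_inf the expected lifetime inference tokens, with the usual approximations of 6ND FLOPs for training and 2N FLOPs per generated token, and L the Chinchilla parametric loss.

    % Sketch of an inference-aware compute objective (standard conventions;
    % not necessarily the paper's exact notation).
    \begin{align*}
      \min_{N,\, D_{\mathrm{tr}}}\quad & 6\,N\,D_{\mathrm{tr}} + 2\,N\,D_{\mathrm{inf}}
        && \text{(training + inference FLOPs)} \\
      \text{s.t.}\quad & L(N, D_{\mathrm{tr}}) = \ell,
        && L(N, D) = E + A\,N^{-\alpha} + B\,D^{-\beta}.
    \end{align*}

Relative to Chinchilla, which minimizes the 6 N D_tr term alone, a large expected D_inf shifts the optimum toward smaller models trained on more tokens.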
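For the BatchNorm entry, the implicit learning-rate decay can be stated via the standard scale-invariance argument from the literature (e.g., van Laarhoven, 2017); this is a sketch of the general phenomenon, not the workshop paper's exact formulation.

    % BN makes the loss scale-invariant in the preceding weights:
    % L(c w) = L(w) for all c > 0, hence \nabla L(c w) = (1/c)\,\nabla L(w).
    % SGD on w therefore acts on the weight direction with effective step size
    \[
      \eta_{\mathrm{eff}} = \frac{\eta}{\lVert w \rVert^{2}},
    \]
    % and since \lVert w \rVert typically grows during training, the effective
    % learning rate decays implicitly; this is the effect that can interact
    % with MAML-style inner-loop adaptation.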
Research Experience
Research Scientist at Databricks/MosaicML (Aug 2023–present): Member of the LLM pre-training team.
Research Scientist Intern at MosaicML (Apr 2023–Aug 2023): Worked on hyperparameter tuning for LLMs and on training and evaluating domain-specific models.
ML Intern (Model Efficiency Team) at Cohere (Oct 2022–): Focused on structured pruning to accelerate LLM inference and training (see the pruning sketch after this list).
Research Intern (ML Algorithms) at Cerebras Systems (Apr 2022–Aug 2022): Researched unstructured sparsity for accelerating neural network training.
Research Intern at Uber ATG (Jan–Jul 2019; Sep–Dec 2019): Worked on retrieval-based localization for self-driving cars.
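
To make the structured-pruning work at Cohere concrete, below is a minimal, hypothetical sketch of magnitude-based structured pruning for a transformer feed-forward block: whole hidden neurons (rows of the first projection and the matching columns of the second) are dropped by L2 norm, so the actual tensor shapes shrink and dense hardware sees the speedup, unlike unstructured masking. The function name, keep_ratio parameter, and layer sizes are illustrative assumptions, not Cohere tooling.

    import torch
    import torch.nn as nn

    def prune_ffn_neurons(fc1: nn.Linear, fc2: nn.Linear, keep_ratio: float = 0.5):
        """Structured pruning sketch (illustrative, not Cohere code):
        drop the lowest-L2-norm hidden neurons of an FFN block
        (fc1 -> activation -> fc2), shrinking real tensor shapes."""
        n_keep = max(1, int(fc1.out_features * keep_ratio))
        # Score each hidden neuron by the norm of its incoming weights.
        scores = fc1.weight.norm(dim=1)                    # shape: (hidden,)
        keep = torch.topk(scores, n_keep).indices.sort().values
        new_fc1 = nn.Linear(fc1.in_features, n_keep, bias=fc1.bias is not None)
        new_fc2 = nn.Linear(n_keep, fc2.out_features, bias=fc2.bias is not None)
        with torch.no_grad():
            new_fc1.weight.copy_(fc1.weight[keep])         # keep rows of fc1
            if fc1.bias is not None:
                new_fc1.bias.copy_(fc1.bias[keep])
            new_fc2.weight.copy_(fc2.weight[:, keep])      # keep matching cols of fc2
            if fc2.bias is not None:
                new_fc2.bias.copy_(fc2.bias)
        return new_fc1, new_fc2

    # Usage: halve the FFN width of a 1024-dim model.
    fc1, fc2 = nn.Linear(1024, 4096), nn.Linear(4096, 1024)
    fc1, fc2 = prune_ffn_neurons(fc1, fc2, keep_ratio=0.5)

Attention-head pruning follows the same pattern, scoring and removing whole heads rather than hidden neurons.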