Yash Savani
Google Scholar ID: WtCWzFwAAAAJ
Ph.D. Student, Carnegie Mellon University
Machine Learning · Deep Learning
Citations & Impact (all-time)
  • Citations: 919
  • H-index: 8
  • i10-index: 8
  • Publications: 19
  • Co-authors: 19
Academic Achievements
  • 2025: Antidistillation Sampling; Safety Pretraining: Toward the Next Generation of Safe AI; Not All Rollouts are Useful: Down-Sampling Rollouts in LLM Reinforcement Learning
  • 2024: Diffusing Differentiable Representations
  • 2022: Deep Equilibrium Optical Flow Estimation
  • 2021: NAS-Bench-x11 and the Power of Learning Curves; Exploring the Loss Landscape in Neural Architecture Search; BANANAS: Bayesian Optimization with Neural Architectures for Neural Architecture Search
  • 2020: Intra-Processing Methods for Debiasing Neural Networks; A Study on Encodings for Neural Architecture Search
  • Workshop publications
    - 2020: A Study on Encodings for Neural Architecture Search; Local Search is State of the Art for Neural Architecture Search Benchmarks
    - 2019: Neural Architecture Search via Bayesian Optimization with a Neural Network Prior; Deep Uncertainty Estimation for Model-based Neural Architecture Search
Research Experience
  • Research Intern, Adobe Research, San Francisco, CA, May 2025 - Nov 2025: worked on improving reinforcement learning methods for fine-tuning flow-based models
  • Research Scientist, Abacus.AI, San Francisco, CA, May 2020 - May 2021: conducted research in AutoML/NAS and fairness in ML, resulting in five papers
  • Machine Learning Engineer, Abacus.AI, San Francisco, CA, Apr 2019 - May 2020: designed and implemented scalable deep learning solutions
Education
  • Ph.D. in Computer Science, Carnegie Mellon University, Aug 2021 - Present. Advisor: Prof. Zico Kolter
  • M.S. in Statistics, Stanford University, Mar 2015 - Jun 2017
  • B.S. in Computer Science, Stanford University, Sep 2013 - Jun 2017
Background
  • Yash is a Ph.D. student in Computer Science at Carnegie Mellon University, working to steer frontier generative AI models toward greater safety, robustness, and efficiency. His research connects the mathematics of high-dimensional learning (differential geometry, stochastic differential equations, optimal transport) with methods for training and steering generative models, spanning pretraining, fine-tuning, reinforcement learning, and controlled decoding. He brings these ideas to life at scale using PyTorch, JAX, CUDA, Triton, and modern distributed-training frameworks (DeepSpeed, FSDP, Megatron).
Miscellany
  • If you're interested in discussing new ideas or collaborating, feel free to send him an email or schedule a meeting.