Two papers presented at ICLR 2024: Scaling Laws for Sparse Models (Spotlight), SpQR
Three papers presented at NeurIPS 2023: CAP (unstructured pruning), a variance-reduction interpretation of knowledge distillation, ZipLM (structured pruning)
Three papers presented at ICML 2023: SparseGPT, SparseProp, QSDP
Supervised Elias Frantar’s influential work on LLM compression, including GPTQ, Marlin, SparseGPT, and QMoE; GPTQ-quantized models have been downloaded millions of times on Hugging Face
Research Experience
Professor at the Institute of Science and Technology Austria (ISTA)
ML Research Lead at Neural Magic, Inc.
Postdoctoral Associate at MIT CSAIL, working with Prof. Nir Shavit