Gabriele Oliaro

Google Scholar ID: 6-evBPAAAAAJ
Carnegie Mellon University, Snowflake AI Research
Machine Learning, Distributed Systems, Parallel Computing, Networking
Citations & Impact
All-time
Citations: 599
H-index: 9
i10-index: 9
Publications: 10
Co-authors: 14 (list available)
Resume (English only)
Academic Achievements
  • First-author paper 'FlexLLM: Token-Level Co-Serving of LLM Inference and Finetuning with SLO Guarantees', NSDI 2026
  • First-author paper 'SuffixDecoding: A Model-Free Approach to Speeding Up Large Language Model Inference', NeurIPS 2025 (Spotlight Award)
  • Co-authored 'SpecInfer: Accelerating Generative Large Language Model Serving with Tree-based Speculative Inference and Verification', ASPLOS 2024 (350+ citations)
  • Co-authored 'Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models', ACL 2024 Oral (Outstanding Paper Award)
  • Contributed to multiple high-impact publications on LLM inference acceleration, speculative decoding, and efficient serving systems (e.g., EuroSys 2026, NeurIPS 2025, arXiv 2025, ASPLOS 2024, SIGCOMM 2023)