ML Research Scientist, Yandex Research, Nov 2023–Feb 2024, Moscow – Led LLM compression research, achieved 87% model size reduction with minimal performance loss, accelerated inference by up to 320% using Triton and C++, and integrated the framework into the Hugging Face Transformers library
Research Intern, KAUST Optimization and Machine Learning Lab, Jul–Sep 2023, Saudi Arabia – Conducted research under Prof. Peter Richtárik on correlated quantization
ML Engineer Intern (NLP), Yandex, Mar–Jul 2022, Moscow – Enabled efficient tabular data insertion for map-reduce LLM inference (120% speedup) and increased test coverage from 0% to 85%
Researcher, Terra Quantum AG, Jul 2020–Jul 2022, Moscow – Researched quantum algorithms for business applications, developed an NMR spectra analysis tool, and optimized LLM deployment for chat assistants (40% latency reduction)