- TokenWeave: Efficient Compute-Communication Overlap for Distributed LLM Inference
- emucxl: an emulation framework for CXL-based disaggregated memory applications
Service
- Artifact Evaluation Committee for ATC'25, OSDI'25, SOSP'25, EuroSys'26
Research Experience
Research Assistant in the AI-Infrastructure team at Microsoft Research India, working on systems for efficient inference of Large Language Models (LLMs) and on more efficient serving of vision transformer models.
Education
Graduated from IIT Bombay in 2023 with a B.Tech (with Honors) in Computer Science. Worked with Prof. Purushottam (Puru) Kulkarni on CXL and persistent memory as an undergraduate.
Background
Research interests lie in computer systems, with a focus on making them more efficient for modern machine learning workloads. Currently a Research Assistant in the AI-Infrastructure team at Microsoft Research India, working on systems for efficient inference of Large Language Models (LLMs).
Miscellany
Hobbies include taking walks and reading. Used to write poetry, but has stopped recently. Plans to apply to CS PhD programs this year to start in Fall 2026, and will be working with Prof. Baris Kasikci.