Saurav Muralidharan
Google Scholar ID: GXlChWcAAAAJ
NVIDIA
Efficient Deep Learning
Large Language Models
Homepage
Google Scholar
Citations & Impact (all-time)
Citations: 538
H-index: 11
i10-index: 11
Publications: 20
Co-authors: 6
Contact
Twitter
GitHub
LinkedIn
Publications (5 listed)
Nemotron Elastic: Towards Efficient Many-in-One Reasoning LLMs · 2025 · Cited 0
Small Language Models are the Future of Agentic AI · 2025 · Cited 0
Efficient Hybrid Language Model Compression through Group-Aware SSM Pruning · 2025 · Cited 0
Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models · 2025 · Cited 0
EoRA: Training-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation · arXiv.org · 2024 · Cited 0
Resume
Academic Achievements
Published 'Efficient Hybrid Language Model Compression through Group-Aware SSM Pruning' at NeurIPS 2025.
Published 'LlamaFlex: Many-in-one LLMs via Generalized Pruning and Weight Sharing' at ICLR 2025.
Published 'Compact Language Models via Pruning and Knowledge Distillation' at NeurIPS 2024.
Published 'MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models' at NeurIPS 2024 (Spotlight).
Released 'LLM Pruning and Distillation in Practice: The Minitron Approach' on arXiv in 2024.
Published 'Flextron: Many-in-One Flexible Large Language Model' at ICML 2024 (Oral).
Released 'HighLight: Efficient and Flexible DNN Acceleration with Hierarchical Structured Sparsity' on arXiv in 2023.
Published 'Uniform Sparsity in Deep Neural Networks' at MLSys 2023.
Co-authors (6 total)
Pavlo Molchanov · NVIDIA Research
Michael Garland · Senior Director of Research, NVIDIA
Jan Kautz · Vice President of Research, NVIDIA Research
Bryan Catanzaro · NVIDIA
Mary Hall · Professor, School of Computing, University of Utah