Rithesh Kumar
Google Scholar ID: hJjeVsQAAAAJ
Adobe Research
Audio, Artificial Intelligence, Deep Learning
Citations & Impact
  • Citations (all-time): 2,865
  • H-index: 8
  • i10-index: 8
  • Publications: 11
  • Co-authors: 0
Resume (English only)
Academic Achievements
  • DMOSpeech: Direct Metric Optimization via Distilled Diffusion Model in Zero-Shot Speech Synthesis (ICML 2025)
  • High-Fidelity Audio Compression with Improved RVQGAN (NeurIPS 2023)
  • VampNet: Music Generation via Masked Acoustic Token Modeling (ISMIR 2023)
  • Chunked Autoregressive GAN for Conditional Waveform Synthesis (ICLR 2022)
  • MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis (NeurIPS 2019)
Research Experience
  • Led speech generation research at Adobe Research, including zero-shot voice generation and voice translation.
  • Served as Technical Lead for Audio Research at Descript Inc., where he developed and shipped multiple text-to-speech models powering the flagship Overdub and Regenerate features.
Education
  • MSc in Computer Science (specializing in Artificial Intelligence) at the Mila lab, Université de Montréal, supervised by Yoshua Bengio.
  • Bachelor's degree in Computer Science and Engineering from SSN College of Engineering (affiliated with Anna University). During the final two years of his undergraduate studies, he learned about deep learning, spent a summer at the Serre Lab at Brown University, and collaborated with Prof. Yoshua Bengio at the Mila lab.
Background
  • Rithesh Kumar is a Senior Research Scientist on the Speech AI team at Adobe Research, focusing on controllable text-to-speech synthesis, automatic dubbing, and speech editing. His work centers on scaling diffusion models and developing efficient distillation algorithms for multilingual audio generation.
Miscellany
  • Currently living in Toronto, Ontario, Canada.