Scholar
Anurag Kumar
Google Scholar ID: HH5cCX0AAAAJ
Google DeepMind
Machine Learning
Speech and Audio Processing
Machine Listening
Sound Event Detection
Speech
Citations & Impact
All-time
Citations
3,620
H-index
25
i10-index
46
Publications
20
Co-authors
21
list available
Publications
1 item
PhaseCoder: Microphone Geometry-Agnostic Spatial Audio Understanding for Multimodal LLMs
2026
Cited
0
Resume (English only)
Academic Achievements
Published extensively in top venues including CVPR, NeurIPS, ICML, ICASSP, Interspeech, IEEE TASLP, and IEEE JSTSP.
Key contributions in Multimodal Understanding/Generation (e.g., xRIR CVPR-2025, VisAH CVPR-2025, AVNeRF NeurIPS-2023, Ego4D CVPR-2022), Speech Enhancement (ICASSP, IEEE JSTSP), and Deep Learning-based Speech Assessment (NeurIPS-2021).
Named to MIT Technology Review Asia Pacific's 'Innovators Under 35' in 2024.
Serves as Associate Editor for IEEE Signal Processing Letters.
Elected to IEEE AASP Technical Committee.
Organizing the Audio Imagination Workshop and the URGENT Challenge on Speech Enhancement at NeurIPS 2024.
The Ego4D paper was a finalist for the Best Paper Award at CVPR 2022.
Torchaudio-Squim (for speech quality assessment) released with PyTorch 2.1.
Co-authors
21 total
Bhiksha Raj
Carnegie Mellon University
Buye Xu
Meta Reality Labs Research
Jacob Donley
Meta
Vamsi Krishna Ithapu
Research Scientist, Reality Labs (Meta)
Chenliang Xu
Associate Professor of Computer Science, University of Rochester
Ke Tan
Research Scientist, Meta Reality Labs
Shinji Watanabe
Carnegie Mellon University
Co-author 8