Zhaoheng Ni

Google Scholar ID: SYFMSNsAAAAJ

Meta Reality Labs

Speech Enhancement, Generative Modeling, Natural Language Processing

Citations & Impact

All-time citations: 1,779

Co-authors: list available

Publications

7 items

UrgentMOS: Unified Multi-Metric and Preference Learning for Robust Speech Quality Assessment

2026

ICASSP 2026 URGENT Speech Enhancement Challenge

2026

SLAP: Scalable Language-Audio Pretraining with Variable-Duration Audio and Multi-Objective Training

2026

Less is More: Data Curation Matters in Scaling Speech Enhancement

2025

URGENT-PK: Perceptually-Aligned Ranking Model Designed for Speech Enhancement Competition

2025

Lessons Learned from the URGENT 2024 Speech Enhancement Challenge

2025

Adapting Whisper for Code-Switching through Encoding Refining and Language-Aware Decoding

2024

Resume (English only)

Academic Achievements

Published papers at conferences including ICASSP 2025, ASRU 2025, Interspeech 2025, and IEEE SLT 2024. Organized events including the URGENT 2025 Challenge and the Audio Imagination Workshop. Contributed to projects including MelodyFlow and FoleyGen.

Research Experience

Senior research scientist at Meta Reality Labs, working on generative models for audio, text, and video. Previously a maintainer of TorchAudio, the official audio library of PyTorch.

Education

PhD studies advised by Michael I. Mandel; undergraduate studies advised by Yan Xu.

Background

Research interests: single-channel and multi-channel speech enhancement, generative models, and natural language processing. Recent interests include generative models for music and audio codecs.

Co-authors

48 total

Shinji Watanabe

Carnegie Mellon University