Scholar

Haohe Liu

Google Scholar ID: g3O4lJMAAAAJ

Research Scientist at Meta AI

Audio Generation, Audio Classification, Speech Quality Enhancement, Music Source Separation

Citations & Impact

All-time citations: 3,723

Publications

19 items

Which Speech Representation Better Matches Text-Native Reasoning? A Study of Speech-Text Alignment on Frame Rate and Representation

2026

EmoOmni: Bridging Emotional Understanding and Expression in Omni-Modal LLMs

2026

SLAP: Scalable Language-Audio Pretraining with Variable-Duration Audio and Multi-Objective Training

2026

UniFlow-Audio: Unified Flow Matching for Audio Generation from Omni-Modalities

2025

Region-Specific Audio Tagging for Spatial Sound

2025

DreamAudio: Customized Text-to-Audio Generation with Diffusion Models

2025

DualMark: Identifying Model and Training Data Origins in Generated Audio

2025

Inference-time Scaling for Diffusion-based Audio Super-resolution

2025

Resume

Academic Achievements

First author of several papers, including AudioLDM, AudioLDM2, NaturalSpeech, VoiceFixer, SemantiCodec, MusicLDM, and AudioSR, with around 3,000 citations. Open-source projects and checkpoints on GitHub have received over 9,500 stars. Received the Best Technical Paper Award at the 159th AES Convention. Published in IEEE/ACM Transactions on Audio, Speech, and Language Processing, IEEE Transactions on Automation Science and Engineering, Reviews in Aquaculture, and other venues.

Research Experience

Research Scientist at Meta. Member of the EPSRC AI for Sound Project (EP/T019751/1).

Education

PhD student at the Centre for Vision, Speech and Signal Processing (CVSSP), University of Surrey. Supervised by Prof. Mark D. Plumbley and co-supervised by Prof. Wenwu Wang. Jointly funded by BBC R&D and the Doctoral College.

Background

Research interests include audio generative models, source separation, quality enhancement, and recognition. Published papers in journals and conferences such as TPAMI, TASLP, JSTSP, ICML, AAAI, NeurIPS, INTERSPEECH, and ICASSP.

Co-authors

24 total

Xubo Liu

Meta Superintelligence Labs

Wenwu Wang

Professor, University of Surrey, UK

Xinhao Mei