Wei-Ning Hsu
Google Scholar ID: N5HDmqoAAAAJ
Facebook AI Research (FAIR)
Speech Processing · Speech Synthesis · Audio Generation · Machine Learning
Citations & Impact
All-time
Citations
13,776
 
H-index
42
 
i10-index
76
 
Publications
20
 
Co-authors
22
Academic Achievements
  • Audio-Visual HuBERT: the first self-supervised model for audio-visual speech, achieving state-of-the-art performance on lip reading, speech recognition, and audio-visual speech recognition with far less labeled data
  • data2vec: the first high-performance self-supervised algorithm that works across speech, vision, and text
  • Textless Speech-to-Speech Translation on Real Data: the first text-free speech-to-speech translation model trained on real data that is on par with text-based models
  • wav2vec-U: an unsupervised speech recognition framework that rivals the best supervised models from two years prior and works for 10 languages
  • Textless NLP: a model capable of prompted or unprompted speech generation without using any text (an audio analogue of GPT-2)
  • HuBERT: a state-of-the-art self-supervised speech representation learning model for recognition, generation, and compression
Research Experience
  • Research Scientist at Facebook AI Research (FAIR)
Education
  • B.S. in Electrical Engineering, National Taiwan University, 2014, supervised by Prof. Lin-shan Lee and Prof. Hsuan-Tien Lin
  • S.M. and Ph.D. in Electrical Engineering and Computer Science, Massachusetts Institute of Technology, 2018 and 2020 respectively, under the supervision of Dr. James Glass
Background
  • Research focuses on representation learning, self-supervised learning, and structured generative modeling for unimodal and multimodal speech. Passionate about reducing the supervision required for speech applications and developing technologies that serve both written and unwritten languages.
Miscellany
  • Based in New York, NY, USA