Introduced the first prompt-adaptation method for frozen acoustic models [ICML 21], nominated for best paper; explored the first series of n-best-hypotheses-based generative error correction methods (ASR and translation pre-training and post-training) [ASRU 23]; co-invented Whispering-LLaMA [EMNLP 23] and HyPoradise [NeurIPS 23], and speech post-training for LLMs [ICLR 24]. Received an honorable mention for best industry paper on multimodal n-best correction [ACL 25].
Research Experience
Senior Research Scientist at NVIDIA; previously a full-time employee at Amazon AGI, working with Andreas Stolcke on Ivan Bulyko's team; also an intern with the Google Speech & Brain teams (now DeepMind), collaborating with Bo Li and Yu Zhang on Tara N. Sainath's team.
Education
Ph.D., Georgia Institute of Technology, advised by Prof. Chin-Hui Lee. Before starting the Ph.D., visited Prof. Jesper Tegnér's group to work on self-evolutionary algorithms and interned at TSMC in mixed-signal IC design.
Background
Focused on speech-language alignment and scaling laws. Prior to joining NVIDIA, worked full-time at Amazon AGI with Andreas Stolcke and interned with the Google Speech & Brain teams (now DeepMind), co-hosted by Bo Li and Yu Zhang on Tara N. Sainath's team.