Liumeng Xue

Google Scholar ID: KNqxVT0AAAAJ

Hong Kong University of Science and Technology

Audio Speech and Language Processing · Speech Generation

Citations & Impact

Citations (all-time): 778

Contact

Email: lmxue@ust.hk

Publications

11 items

ISCSLP 2026 CoT-TTS Challenge: Chain-of-Thought Reasoning for Context-Aware Text-to-Speech

2026

AudioX-Turbo: A Unified Framework for Efficient Anything-to-Audio Generation

2026

MINT-Bench: A Comprehensive Multilingual Benchmark for Instruction-Following Text-to-Speech

2026

NVBench: A Benchmark for Speech Synthesis with Non-Verbal Vocalizations

2026

Iterate to Differentiate: Enhancing Discriminability and Reliability in Zero-Shot TTS Evaluation

2026

SLAM-LLM: A Modular, Open-Source Multimodal Large Language Model Framework and Best Practice for Speech, Language, Audio and Music Processing

IEEE Journal on Selected Topics in Signal Processing · 2026

PodEval: A Multimodal Evaluation Framework for Podcast Audio Generation

2025

YuE: Scaling Open Foundation Models for Long-Form Music Generation

2025

Resume

Academic Achievements

Publications include 'Audio-FLAN: An Instruction-Following Dataset for Unified Understanding and Generation of Speech, Music, and Sound'. Contributed to several projects, including development of the Amphion open-source platform.

Research Experience

Served as a research intern at Microsoft (2019.04 - 2020.06, 2021.11 - 2022.10), Tencent AI Lab (2021.06 - 2021.11), and JD.COM AI Lab (2018.10 - 2019.04).

Education

Received a Ph.D. from the Audio, Speech and Language Processing Laboratory at Northwestern Polytechnical University, supervised by Prof. Lei Xie.

Background

Currently a Postdoctoral Researcher at the Hong Kong University of Science and Technology, working with Prof. Yike Guo and Prof. Wei Xue. Research interests include audio, speech, and language processing, as well as audio, music, and speech generation.
