1. Diffsound: Discrete Diffusion Model for Text-to-sound generation, IEEE Transactions on Audio, Speech and Language Processing, 2023.
2. InstructTTS: Modelling Expressive TTS in Discrete Latent Space with Natural Language Style Prompt, IEEE Transactions on Audio, Speech and Language Processing, 2024.
3. UniAudio: An Audio Foundation Model Toward Universal Audio Generation, ICML, 2024.
4. UniAudio 1.5: Large Language Model-driven Audio Codec is A Few-shot Audio Task Learner, NeurIPS, 2024.
5. ALMTokenizer: A Low-bitrate and Semantic-rich Audio Codec Tokenizer for Audio Language Modeling, ICML, 2025.
6. SimpleSpeech 2: Towards Simple and Efficient Text-to-Speech with Flow-based Scalar Latent Transformer Diffusion Models, IEEE Transactions on Audio, Speech and Language Processing, 2025.
7. SimpleSpeech: Towards Simple and Efficient Text-to-Speech with Scalar Latent Transformer Diffusion Models, Proc. Interspeech, 2024.
8. Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models, ICML, 2023.
9. A Mixed Supervised Learning Framework for Target Sound Detection, DCASE Workshop, 2022.
10. AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head, AAAI, 2024.
2. May 2021 - May 2023, Tencent AI Lab, Speech Group, Intern, Supervisors: Songxiang Liu, Chao Weng, and Bo Wu.
Education
1. The Chinese University of Hong Kong, School of Electronic and Computer Engineering, PhD in progress, August 2023 - Present.
2. Peking University, School of Computer Engineering and Science, Master, August 2020 - August 2023.
3. Shanghai University, August 2016 - July 2020.
Background
Research interests include Audio Foundation Models, Generative Models, Large Language Models, and Audio/Speech Processing. Currently a PhD student at The Chinese University of Hong Kong, supervised by Prof. Helen Meng. Received a master's degree from Peking University in 2023.
Miscellany
Actively looking for collaboration opportunities in areas such as Audio Foundation Models, Generative Models, TTS, and Text-to-Audio.