Songxiang Liu

Google Scholar ID: 4fD1l28AAAAJ
Meituan multi-modal team, PhD (The Chinese University of Hong Kong)
Multi-Modal, LLM, Audio foundation model, Speech synthesis
Citations & Impact
All-time
Citations: 2,066
H-index: 23
i10-index: 38
Publications: 20
Co-authors: 0
Resume (English only)
Academic Achievements
  • 2025: 'ALMTokenizer: A Low-bitrate and Semantic-rich Audio Codec Tokenizer for Audio Language Modeling' accepted by ICML 2025
  • Apr 2025: Released technical report of Kimi-Audio (with code, model, and paper)
  • 2024: 'UniAudio: Towards Universal Audio Generation with Large Language Models' accepted by ICML 2024
  • 2024: 'InstructTTS: Modelling Expressive TTS in Discrete Latent Space with Natural Language Style Prompt' accepted by IEEE/ACM TASLP (corresponding author)
  • 2023: Released InstructTTS on expressive TTS with natural language prompts (arXiv)
  • 2022: Published DiffGAN-TTS paper (arXiv)
  • 2021: DiffSVC paper accepted by ASRU 2021
  • 2021: Published singing voice conversion work using denoising diffusion probabilistic models (DDPM) (arXiv)
  • 2021: FastSVC paper accepted as an oral presentation at ICME 2021
  • Multiple IEEE/ACM TASLP journal papers on voice conversion, emotive speech synthesis, and speech emotion recognition