Xuenan Xu

Google Scholar ID: e0h0ae8AAAAJ
Shanghai Jiao Tong University
audio generation, audio understanding, speech synthesis
Citations & Impact
All-time
  • Citations: 908
  • H-index: 17
  • i10-index: 25
  • Publications: 20
  • Co-authors: 0
Resume (English only)
Academic Achievements
  • Published multiple papers on audio understanding and generation. Specific projects include: improving accuracy, diversity, temporal accuracy, and efficiency in audio captioning; the task definition and a weakly-supervised training paradigm for text-to-audio grounding; BLAT, Auto-ACD, and detailed audio-text simulation; visually-enhanced diverse generation; PicoAudio, together with a temporal-sensitive evaluation benchmark; SemantiCodec, an audio codec for audio LLMs; and content creation with LLM agents, e.g., AI storytelling for children.
Research Experience
  • Mainly focuses on general audio understanding and generation, including audio captioning, text-to-audio grounding, audio-text retrieval, and text-to-audio generation. Also interested in speech and music understanding and generation, and in how they interact with general audio.
Education
  • 2019.9 - 2025.6, Ph.D., Shanghai Jiao Tong University, supervised by Prof. Mengyue Wu and Prof. Kai Yu.
  • 2023.10 - 2024.4, visiting Ph.D., University of Surrey, supervised by Prof. Mark D. Plumbley and Prof. Wenwu Wang.
  • 2015.9 - 2019.6, Bachelor, Shanghai Jiao Tong University, supervised by Leyun Wang.
Background
  • A fourth-year Ph.D. candidate at X-LANCE Lab, Shanghai Jiao Tong University, supervised by Prof. Mengyue Wu and Prof. Kai Yu. Research interests include audio, speech, and music understanding and generation, and large language models.
Miscellany
  • Expected to graduate in June 2025 and open to job opportunities in 2025. Can be contacted via LinkedIn or WeChat.