First author of several papers, including AudioLDM, AudioLDM2, NaturalSpeech, VoiceFixer, SemantiCodec, MusicLDM, and AudioSR, with around 3000 citations. Open-source projects and checkpoints on GitHub have received over 9500 stars. Received the Best Technical Paper Award at the 159th AES Convention. Published in IEEE/ACM Transactions on Audio, Speech, and Language Processing, IEEE Transactions on Automation Science and Engineering, Reviews in Aquaculture, and more.
Research Experience
Research Scientist at Meta. Member of the EPSRC AI for Sound Project (EP/T019751/1).
Education
PhD student at the Centre for Vision, Speech and Signal Processing (CVSSP), University of Surrey. Supervised by Prof. Mark D. Plumbley and co-supervised by Prof. Wenwu Wang. Jointly funded by BBC R&D and the Doctoral College.
Background
Research interests include audio generative models, source separation, quality enhancement, and recognition. Papers published in journals and conferences including TPAMI, TASLP, JSTSP, ICML, AAAI, NeurIPS, INTERSPEECH, and ICASSP.