Published several papers, including 'MSVIT: Improving Spiking Vision Transformer Using Multi-scale Attention Fusion' (IJCAI 2025), 'Unlocking the Potential of Pre-trained Vision Transformers for Few-Shot Semantic Segmentation through Relationship Descriptors' (CVPR 2024), and 'MuseBarControl: Enhancing Fine-Grained Control in Symbolic Music Generation through Pre-Training and Counterfactual Loss' (arXiv:2402.01157, 2024).
Research Experience
Currently an Associate Lecturer in the School of Systems and Computing at the University of New South Wales (UNSW), Australia. Previously worked as a Research Fellow at the Australian Institute for Machine Learning (AIML) at the University of Adelaide, advised by Prof. Lingqiao Liu.
Education
Received an M.S. degree in Computer Science from the University of Science and Technology of China in 2018, advised by Prof. Shangfei Wang. Completed a Ph.D. in 2021 at the University of Technology Sydney, in the Data Science and Machine Intelligence Lab of the Faculty of Engineering and Information Technology, advised by Prof. Guandong Xu.
Background
Research interests lie in machine learning, computer vision, multimedia, privileged information, and related applications in artificial intelligence, including multi-task learning, fine-grained recognition, music emotion, music composition, and photo aesthetics. Recent research focuses on rationale-guided machine learning and large language models.
Miscellany
Deeply passionate about techniques that can enhance the training and deployment of large language models, with a particular focus on music large language models. Interests include developing improved training methodologies, designing advanced strategies for better generation control, and reducing inference time.