Publications
Published multiple papers, including 'Attention-driven GUI Grounding: Leveraging Pretrained Multimodal Large Language Models without Fine-Tuning' at AAAI 2025, 'A Simple-but-effective Baseline for Training-free Class-Agnostic Counting' at WACV 2025, and 'Unlocking the Potential of Pre-trained Vision Transformers for Few-Shot Semantic Segmentation through Relationship Descriptors' at CVPR 2024.
Research Experience
Currently a Machine Learning Engineer at TikTok. Previously a Postdoctoral Research Fellow at the Australian Institute for Machine Learning (AIML) and the Centre for Augmented Reasoning (CAR) at The University of Adelaide, working with A/Prof. Lingqiao Liu.
Education
Ph.D. in Computer Science from The University of Adelaide in 2022, supervised by A/Prof. Lingqiao Liu; M.S. in Computer Science from Southeast University, China, in 2018, supervised by Prof. Hui Xue; B.S. in Computer Science from China University of Mining and Technology in 2015.
Background
Currently a Machine Learning Engineer at TikTok, researching and developing a new-generation multimodal-LLM-based AI moderation system to strengthen data trust and safety on the TikTok platform. Interested in general machine learning and computer vision research problems, such as multimodal large language models (MLLMs), self-supervised learning (SSL), and pretrained foundation models.
Miscellany
Recent research focuses on building knowledge-driven web agents.