- AAAI 2026: Multi-Agent VLMs Guided Self-Training with PNU Loss
- NeurIPS 2025: Retrv-R1: A Reasoning-Driven MLLM Framework
- ICML 2025: CPCF: A Cross-Prompt Contrastive Framework
- CVPR 2025: POPEN: Preference-Based Optimization and Ensemble
- NeurIPS 2024: Hybrid Mamba for Few-Shot Segmentation
- ICML 2024: Discrete Latent Perspective Learning
- CVPR 2024: LLaFS: When Large Language Models Meet Few-Shot Segmentation
- CVPR 2024: Addressing Background Context Bias in Few-Shot Segmentation through Iterative Modulation
Research Experience
Currently a Research Fellow at the Rapid-Rich Object Search (ROSE) Lab, Nanyang Technological University, working with Professor Bihan Wen. Previously a postdoctoral fellow at City University of Hong Kong, working with Professor Shiqi Wang. Also worked at Megvii and SenseTime. Currently collaborating closely with NVIDIA, Alibaba (Professor Jieping Ye), and Tencent.
Education
Received a bachelor's degree from Beihang University in June 2020; obtained a Ph.D. from the Singapore University of Technology and Design (SUTD) in 2025, supervised by Professor Jun Liu.
Background
Research interests are multimodal learning and computer vision, with a current focus on multimodal large language models (MLLMs) and image segmentation. The research goal is to build efficient, trustworthy, and fine-grained multimodal systems that process and integrate information from diverse modalities (such as text, images, videos, and other sensor data) to address a wide range of real-world industrial and scientific challenges.
Miscellany
Open to research collaborations; feel free to drop an email.