- Revealing Biased Personality in MLLM: A Study on Personalized Image Aesthetic Assessment (EMNLP, 2025)
- SEFE: Superficial and Essential Forgetting Eliminator for Multimodal Continual Instruction Tuning (ICML, 2025)
- ICM-Assistant: Instruction-tuning Multimodal Large Language Models for Rule-based Explainable Image Content Moderation (AAAI, 2025)
- LLaVA-SpaceSGG: Visual Instruct Tuning for Open-vocabulary Scene Graph Generation with Enhanced Spatial Relations (WACV, 2025)
- Modeling Dual-Exposure Quad-Bayer Patterns for Joint Denoising and Deblurring (IEEE Transactions on Image Processing, 2024)
- SVCNet: Real-time Scribble-based Video Colorization with Pyramid Networks (IEEE Transactions on Image Processing, 2023)
- HSGAN: Hyperspectral Reconstruction from RGB Images with Generative Adversarial Network (IEEE Transactions on Neural Networks and Learning Systems, 2023)
- D2HNet: Joint Denoising and Deblurring with Hierarchical Network for Robust Night Image Restoration (ECCV, 2022)
Research Experience
Currently a researcher at 2012 Labs, Huawei Hong Kong Research Center, working on MLLM and AI Agent projects. The team develops a multimodal content moderation system and a GUI Test Agent, and its research results have been applied in Huawei’s product lines. Former student researcher at AI Imaging Group, SenseTime, working on computational photography research and projects; developed two joint deblurring and denoising systems (one for RGB images and one for RAW images). Former student researcher at Lightspeed and Quantum Studios, Tencent IEG, working on AIGC projects (e.g., Stable Diffusion).
Education
Received the B.Eng. degree from the School of Electronic and Information Engineering (Qiming College), Huazhong University of Science and Technology, in June 2018, and the Ph.D. degree from the Department of Electronic Engineering, City University of Hong Kong, in February 2023.
Background
Broad interests in AI applications, including low-level vision, computational photography, and generative models (e.g., GANs and diffusion models). Recent focus is on applications of Multimodal Large Language Models (MLLMs), e.g., AI agents.