Browse publications on Google Scholar (top-right) ↗
Resume (English only)
Academic Achievements
Conference Paper: Evaluating Multimodal Large Language Models on Video Captioning via Monte Carlo Tree Search, ACL 2025.
Conference Paper: Masked Jigsaw Puzzle: A Versatile Position Embedding for Vision Transformers, CVPR 2023.
Conference Paper: Efficient Training of Visual Transformers with Small Datasets, NeurIPS 2021.
Conference Paper: Smoothing the Disentangled Latent Style Space for Unsupervised Image-to-Image Translation, CVPR 2021.
Conference Paper: Describe What to Change: A Text-guided Unsupervised Image-to-Image Translation Approach, ACM MM 2020.
Journal Article: Spatial Entropy as an Inductive Bias for Vision Transformers, Machine Learning 2024 (Impact Factor: 5.8).
Journal Article: ISF-GAN: An Implicit Style Function for High-Resolution Image-to-Image Translation, IEEE TMM 2022 (Impact Factor: 8.4).
Research Experience
Researcher - Kuaishou Technology, Beijing, China, 01/2025 - Present, Research focus: MLLMs, Formal Theorem Proving and Agents.
Researcher - Huawei, Shenzhen, China, 08/2022 - 01/2025, Research focus: Image Generation and Enhancement (GANs and Diffusion Models).
Research Intern - Tencent AI Lab, Shenzhen, China, 2021 - 06/2022, Mentors: Dr. Linchao Bao and Dr. Wei Bi, Research focus: GANs, Image Domain Translation.
PhD Student - FBK and MHUG, Trento, Italy, 12/2018 - 06/2022, Mentors: Prof. Nicu Sebe and Dr. Bruno Lepri, Research focus: Deep Learning, GANs, Cross-modal Representations, Image Domain Translation.
Research Intern - Tencent AI Lab, Shenzhen, China, 11/2017 - 09/2018, Mentors: Dr. Wei Bi and Dr. Xiaojiang Liu, Research focus: Deep Learning, Neural Dialogue Generation.
Master's Student - Computer Vision and Remote Sensing (CVRS) Lab, Wuhan, China, 03/2015 - 06/2018, Mentor: Prof. Jian Yao, Research focus: Deep Learning, Remote Sensing.
Background
I am currently a Researcher at Kuaishou Technology, working on Computer Vision and Natural Language Processing. My current research interests include Multimodal Large Language Models (MLLMs), Formal Theorem Proving, and AI Agents.
Miscellany
We are actively recruiting interns for long-term research positions. If you are interested, please feel free to send your resume to my email.