Published multiple papers, including 'UNIDOC-BENCH: A Unified Benchmark for Document-Centric Multimodal RAG', 'CoDA: Coding LM via Diffusion Adaptation', and 'Vlm2vec-v2 (MMEB-V2): Advancing multimodal embedding for videos, images, and visual documents'. Several projects have been accepted or released, at venues such as NeurIPS 25, ACL Findings, and CVPR 25, among others.
Research Experience
Started working as a Research Scientist at Salesforce AI Research in 2023.
Education
Earned a Ph.D. from Northeastern University in Boston, USA, in 2023, with research centered on Transfer Learning and Efficient AI. Obtained a B.E. degree from Xidian University in Xi’an, China, in 2018.
Background
Currently a Research Scientist at Salesforce AI Research, focusing on Generative AI and Multi-modal Learning, particularly Video/Image-to-Text (Understanding) and Text-to-Video/Image (Generation) techniques.