Selected Publications: CogCoM: Train Large Vision-Language Models Diving into Details through Chain of Manipulations; CogVLM: Visual Expert for Large Language Models; VidCoM: Fast Video Comprehension through Large Language Models with Multimodal Tools; GOAL: A Challenging Knowledge-grounded Video Captioning Benchmark for Real-time Soccer Commentary Generation; Preserving Knowledge Invariance: Rethinking Robustness Evaluation of Open Information Extraction (Outstanding Paper Award from EMNLP 2023); Syntactically Robust Training on Partially-Observed Data for Open Information Extraction.
Honors & Awards: Outstanding Paper Award from EMNLP 2023; Dengfeng Scholarship from Tsinghua University, 2023.
Research Experience
Ph.D. student at Tsinghua University; Visiting Research Student at NExT++ Lab, National University of Singapore.
Education
Ph.D. student in Computer Science and Technology at Tsinghua University (2020.09 - Present), advised by Prof. Bin Xu and Prof. Juanzi Li; Visiting Research Student at NExT++ Lab, National University of Singapore (2024.04 - 2024.10), advised by Prof. Tat-Seng Chua.
Background
Research interests: multimodal large language models, with a particular focus on training multimodal language models for long context and video.