Publications
NegVQA: Can Vision Language Models Understand Negation? ACL Findings 2025
Why are Visually-Grounded Language Models Bad at Image Classification? NeurIPS 2024
MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research
Research Experience
Undergraduate visiting research intern at the MARVL lab, Stanford University (Summer 2024), advised by Prof. Serena Yeung-Levy.
Education
Tsinghua University, B.S. in Computer Science and Technology (in progress).
Background
Senior undergraduate student at Tsinghua University, majoring in Computer Science and Technology. Research interests: computer vision, natural language processing, and multimodal learning.