Publications: 'Video-RTS: Rethinking Reinforcement Learning and Test-Time Scaling for Efficient and Enhanced Video Reasoning' (EMNLP 2025); 'MEXA: Towards General Multimodal Reasoning with Dynamic Multi-Expert Aggregation' (EMNLP 2025 Findings); 'SiLVR: A Simple Language-based Video Reasoning Framework' (arXiv); 'TimeRefine: Temporal Grounding with Time Refining Video LLM' (arXiv); 'VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos' (CVPR 2025). Awards: 1st place in Track 1B (Multi-Discipline Lecture Understanding) at the CVPR 2025 Multimodal Video Agent Workshop.
Research Experience
Worked as a research intern at Salesforce (summer 2025; manager: Dr. Juan Carlos Niebles), the Meta FAIR Perception team (summer 2024; managers: Dr. Ronghang Hu and Dr. Christoph Feichtenhofer), and Amazon Alexa AI (collaborating with Dr. Heba Elfardy, Dr. Kevin Small, and Dr. Markus Dreyer). Also interned at Tsinghua AIR, working with Prof. Jingjing Liu.
Education
Ph.D. in Computer Science from the University of North Carolina at Chapel Hill, advised by Prof. Mohit Bansal; B.S. from UESTC, advised by Prof. Jingjing Li.
Background
Research interests: video-language understanding and multimodal AI, with a particular focus on reasoning over long and complex videos.
Miscellany
Actively looking for internship positions for summer 2026. Contact: ziyangw at cs . unc . edu.