
Zhengyang Liang

Google Scholar ID: 9IC8FBQAAAAJ

Singapore Management University

Multimodal, Computer Vision


Citations & Impact

All-time citations: 405


Contact

Email: chr1ce@foxmail.com

Publications

10 items

DyCo-RL: Dynamic Cross-Modal Coordination for Visual Reasoning

2026


Explicit Critic Guidance for Aligning Diffusion Models

2026


DeepXiv-SDK: An Agentic Data Interface for Scientific Papers

2026


Video-BrowseComp: Benchmarking Agentic Video Research on Open Web

2025


UniVA: Universal Video Agent towards Open-Source Next-Generation Video Generalist

2025


TimeScope: Towards Task-Oriented Temporal Grounding In Long Videos

2025


Video-XL-2: Towards Very Long-Video Understanding Through Task-Aware KV Sparsification

2025


MomentSeeker: A Comprehensive Benchmark and A Strong Baseline For Moment Retrieval Within Long Videos

2025


Resume (English only)

Academic Achievements

Paper 'Dynamic Self-adaptive Multiscale Distillation from Pre-trained Multimodal Large Model for Efficient Cross-modal Representation Learning' accepted at ACM MM 2025
Three papers accepted at CVPR 2025, including 'Video-XL: Extra-Long Vision Language Model for Hour-Scale Video Understanding' (Oral)
Paper 'Any Information Is Just Worth One Single Screenshot: Unifying Search With Visualized Information Retrieval' accepted at ACL 2025
Released the Video-XL-2 model in June 2025
Honorable Mention, Mathematical Contest in Modeling (MCM), USA, 2023
National Second Prize (First Prize in the Beijing division), China Undergraduate Mathematical Contest in Modeling, 2022
Published 'Self-Supervised Multi-Modal Knowledge Graph Contrastive Hashing for Cross-Modal Search' at AAAI 2023 as first student author

Co-authors

3 total

Bo Zhao

Shanghai Jiao Tong University

Zheng LIU

Microsoft Research

Lizi Liao

Singapore Management University