Yuhao Dong

Google Scholar ID: kMui170AAAAJ

Tsinghua University, Nanyang Technological University

Multi-modal Learning, Computer Vision

Citations & Impact

All-time

Citations: 1,092

H-index: 14

i10-index: 14

Publications: 20

Co-authors: 14

Publications

27 items

ViQ: Text-Aligned Visual Quantized Representations at Any Resolution
2026. Cited by: 0

S-Agent: Spatial Tool-Use Elicits Reasoning for Spatial Intelligence
2026. Cited by: 0

From Pixels to Words -- Towards Native One-Vision Models at Scale
2026. Cited by: 0

LongAV-Compass: Towards Unified Evaluation of Minute-Scale Audio-Visual Generation Across T2AV, I2AV, and V2AV
2026. Cited by: 0

Artifact-Bench: Evaluating MLLMs on Detecting and Assessing the Artifacts of AI-Generated Videos
2026. Cited by: 0

Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding
2026. Cited by: 0

FileGram: Grounding Agent Personalization in File-System Behavioral Traces
2026. Cited by: 0

PerceptionComp: A Video Benchmark for Complex Perception-Centric Reasoning
2026. Cited by: 0

Co-authors

14 total

Associate Professor, Nanyang Technological University

Tencent Hunyuan

Tsinghua University

PhD Candidate, MMLab@NTU

PhD Student@NTU, Singapore

University of Washington, Allen Institute for AI

Ph.D. Student, Nanyang Technological University