Scholar

Hang Wu

Google Scholar ID: D4vnZtEAAAAJ

University of California, Merced

multimodal language model, computer vision

Resume (English only)

Academic Achievements

- 2025.09: Three papers submitted to ICLR 2026
- 2025.08: Paper DiMo-GUI accepted to EMNLP 2025 Main Conference
- 2025.05: Two papers submitted to EMNLP 2025
- 2024.11: One paper submitted to CVPR 2025
- Publications:
  - RefineShot: Rethinking Cinematography Understanding with Foundational Skill Evaluation
  - FrameMind: Frame-Interleaved Video Reasoning via Reinforcement Learning
  - DiMo-GUI: Advancing Test-time Scaling in GUI Grounding via Modality-Aware Visual Reasoning
  - Structured Attention Matters to Multimodal LLMs in Document Understanding

Research Experience

- 2025.03 - 2025.07: Research Intern, vivo, Shenzhen, China
- 2025.01 - 2025.07: Research Intern, UC Merced NLP Lab, University of California, Merced (remote)
- 2023.09 - 2025.03: Research Intern, Ni’s Group, Tongji University, Shanghai, China

Background

- Research Interests: Vision-language models and large multimodal models, focusing on improving their performance and specific applications
- Education: Bachelor's degree from Tongji University; PhD student at the University of California, Merced
- Advisors: Prof. Yiwei Wang (primary advisor) and Prof. Ming-Hsuan Yang (senior advisor); also works closely with Prof. Yujun Cai
