Hang Wu

Google Scholar ID: D4vnZtEAAAAJ
University of California, Merced
Research areas: multimodal language model, computer vision
Citations & Impact (all-time)
  • Citations: 15
  • H-index: 2
  • i10-index: 1
  • Publications: 4
  • Co-authors: 0
Resume (English only)
Academic Achievements
  • 2025.09: Three papers submitted to ICLR 2026
  • 2025.08: Paper DiMo-GUI accepted to EMNLP 2025 Main Conference
  • 2025.05: Two papers submitted to EMNLP 2025
  • 2024.11: One paper submitted to CVPR 2025
  • Publications:
      - RefineShot: Rethinking Cinematography Understanding with Foundational Skill Evaluation
      - FrameMind: Frame-Interleaved Video Reasoning via Reinforcement Learning
      - DiMo-GUI: Advancing Test-time Scaling in GUI Grounding via Modality-Aware Visual Reasoning
      - Structured Attention Matters to Multimodal LLMs in Document Understanding
Research Experience
  • 2025.03 - 2025.07: Research Intern, vivo, Shenzhen, China
  • 2025.01 - 2025.07: Research Intern, UC Merced NLP Lab, University of California, Merced (remote)
  • 2023.09 - 2025.03: Research Intern, Ni’s Group, Tongji University, Shanghai, China
Education
  • 2025.08 - Present: PhD student, University of California, Merced
  • 2021.09 - 2025.06: Undergraduate student, Tongji University
Background
  • Research Interests: Vision-language models and large multimodal models, with a focus on improving their performance and applying them to specific tasks
  • Education: Bachelor's degree from Tongji University; currently a PhD student at the University of California, Merced
  • Advisors: Prof. Yiwei Wang (primary advisor) and Prof. Ming-Hsuan Yang (senior advisor); also works closely with Prof. Yujun Cai