Scholar
Wenyi Hong
Google Scholar ID: JSEzrlwAAAAJ
Tsinghua University
multimodal pretraining
Citations & Impact
All-time
Citations: 4,964
H-index: 13
i10-index: 14
Publications: 16
Co-authors: 6 (list available)
Publications
11 items
Vision2Web: A Hierarchical Benchmark for Visual Website Development with Agent Verification (2026). Cited: 0
GLM-OCR Technical Report (2026). Cited: 0
UI2Code^N: A Visual Language Model for Test-Time Scalable Interactive UI-to-Code Generation (2025). Cited: 0
WebVIA: A Web-based Vision-Language Agentic Framework for Interactive and Verifiable UI-to-Code Generation (2025). Cited: 0
Glyph: Scaling Context Windows via Visual-Text Compression (2025). Cited: 0
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning (2025). Cited: 0
MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models (2025). Cited: 0
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer (arXiv.org, 2024). Cited: 261
Co-authors
6 total
Ming Ding
Tsinghua University, ByteDance Seed
Tang Jie
WeBank Chair Professor, Tsinghua University
Yuxiao Dong
CS, Tsinghua University
Wendi Zheng
PhD Student, Tsinghua University
Jiazheng Xu
Tsinghua University
Zhuoyi Yang
Tsinghua University