Scholar
Shuai Bai
Google Scholar ID: ylhI1JsAAAAJ
Qwen Team, Alibaba Group
Multi-Modal Learning
Visual Generation
Citations & Impact (All-time)
Citations: 17,689
H-index: 23
i10-index: 25
Publications: 20
Co-authors: 18 (list available)
Contact
No contact links provided.
Publications (22 items)
Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments (2026). Cited: 0
FineVLA: Fine-Grained Instruction Alignment for Steerable Vision-Language-Action Policies (2026). Cited: 0
CUA-Gym: Scaling Verifiable Training Environments and Tasks for Computer-Use Agents (2026). Cited: 0
MPDocBench-Parse: Benchmarking Practical Multi-page Document Parsing (2026). Cited: 0
Qwen3-VL-Seg: Unlocking Open-World Referring Segmentation with Vision-Language Grounding (2026). Cited: 0
CC-OCR V2: Benchmarking Large Multimodal Models for Literacy in Real-world Document Processing (2026). Cited: 0
GenMask: Adapting DiT for Segmentation via Direct Mask (2026). Cited: 0
Learning Transferable Temporal Primitives for Video Reasoning via Synthetic Videos (2026). Cited: 0
Co-authors
18 total
Junyang Lin
Qwen Team, Alibaba Group & Peking University
Jingren Zhou
Alibaba Group, Microsoft
Hongxia Yang
Professor, HK Polytechnic University
Dayiheng Liu (刘大一恒)
Qwen Team, Alibaba Group
Rui Men
Qwen Team, Alibaba Group & Peking University