Scholar
Sijie Zhu
Google Scholar ID: 8aO4k80AAAAJ
Unknown affiliation
Homepage
Google Scholar
Citations & Impact
All-time
Citations
4,703
H-index
19
i10-index
21
Publications
20
Co-authors
14
list available
Contact
GitHub
LinkedIn
Publications
8 items
Thinking With Bounding Boxes: Enhancing Spatio-Temporal Video Grounding via Reinforcement Fine-Tuning
2025
Cited
0
Vidi2: Large Multimodal Models for Video Understanding and Creation
2025
Cited
0
SuperEdit: Rectifying and Facilitating Supervision for Instruction-Based Image Editing
2025
Cited
0
Vidi: Large Multimodal Models for Video Understanding and Editing
2025
Cited
0
Where do Large Vision-Language Models Look at when Answering Questions?
2025
Cited
0
Rethinking Homogeneity of Vision and Text Tokens in Large Vision-and-Language Models
2025
Cited
0
Multi-Reward as Condition for Instruction-based Image Editing
arXiv.org · 2024
Cited
2
Edit3K: Universal Representation Learning for Video Editing Components
arXiv.org · 2024
Cited
2
Co-authors
14 total
Chen Chen
Associate Professor, Center for Research in Computer Vision, University of Central Florida
Mubarak Shah
Trustee Chair Professor of Computer Science, University of Central Florida
Longyin Wen
ByteDance Inc.
Ce Zheng
PostDoc at Carnegie Mellon University; Ph.D., University of Central Florida
Linjie Yang
ByteDance Inc.