Hongwei Xue

Google Scholar ID: k5CJa5YAAAAJ
University of Science and Technology of China
Multi-Modal · Vision-Language
Citations & Impact (All-time)
  • Citations: 865
  • H-index: 9
  • i10-index: 9
  • Publications: 15
  • Co-authors: 14
Academic Achievements
  • Published several papers, including 'Visual Perception by Large Language Model's Weights' (NeurIPS 2024), 'Multi-Modal Generative Embedding Model' (arXiv), 'Stare at What You See: Masked Image Modeling without Reconstruction' (CVPR 2023), and 'CLIP-ViP: Adapting Pre-trained Image-Text Model to Video-Language Representation Alignment' (ICLR 2023).
Research Experience
  • Worked as a researcher at ByteDance Research; previously held positions at NUS, Tencent WeChat, Shanghai AI Lab, and Microsoft Research Asia (MSRA). He was a main contributor to WeCLIP, a multi-modal foundation model used across various WeChat applications, and also contributed to PixelDance, a video generation model.
Education
  • Ph.D. from the University of Science and Technology of China (USTC), advised by Jiebo Luo and Houqiang Li; B.S. from the School of the Gifted Young, USTC.
Background
  • Research interests include Multi-Modal Learning, Computer Vision, and Machine Learning. Much of his research focuses on Vision-and-Language Pre-training.
Miscellany
  • His personal website provides more details about his projects and contact information.