Zhuofan Xia

Google Scholar ID: m2M6b58AAAAJ
PhD candidate, Tsinghua University
Efficient Deep Learning, Computer Vision, Multimodal Learning
Citations & Impact
All-time
  • Citations: 2,454
  • H-index: 11
  • i10-index: 11
  • Publications: 14
  • Co-authors: 27 (list available)
Resume (English only)
Academic Achievements
  • ECCV 2024: Agent Attention: On the Integration of Softmax and Linear Attention
  • CVPR 2024: GSVA: Generalized Segmentation via Multimodal Large Language Models
  • Preprint: DAT++: Spatially Dynamic Vision Transformer with Deformable Attention
  • ICCV 2023: Adaptive Rotated Convolution for Rotated Object Detection
  • CVPR 2023: Slide-Transformer: Hierarchical Vision Transformer with Local Self-Attention
  • ICLR 2023: Budgeted Training for Vision Transformer
  • CVPR 2022: Vision Transformer with Deformable Attention (Best Paper Finalist)
  • CVPR 2021: 3D Object Detection with Pointformer
  • Preprint: Demystify Mamba in Vision: A Linear Attention Perspective
  • Honors and Awards: Multiple scholarships from Tsinghua University, including the Friend of Tsinghua – Ubiquant Scholarship, the Hefei Talent Scholarship, and the Samsung Scholarship.
Research Experience
  • Currently focuses on dynamic and efficient large multimodal models.
Background
  • Fourth-year Ph.D. candidate in the Department of Automation, Tsinghua University, advised by Prof. Gao Huang and Prof. Shiji Song. Research focuses on deep learning for computer vision and multimodal learning, specifically Vision Transformers (2D/3D), dynamic neural architectures, and large multimodal models.