Muhammad Maaz

Google Scholar ID: vTy9Te8AAAAJ
PhD in Computer Vision at MBZUAI
Computer Vision, Deep Learning, Vision-Language, Generative AI
Citations & Impact
  • Citations (all-time): 4,237
  • H-index: 12
  • i10-index: 13
  • Publications: 17
  • Co-authors: 21
Resume (English only)
Academic Achievements
  • Awarded the Google PhD Fellowship 2025 in Machine Perception
  • Released Perception Language Model (PLM)
  • Released VideoGPT+ model, dataset, and benchmark
  • Released LLaVA++
  • Perception Language Model (PLM, Spotlight) and Perception Encoder (Oral) accepted to NeurIPS 2025
  • Video-ChatGPT accepted at ACL 2024
  • GLaMM accepted at CVPR 2024
  • Published papers: VideoMathQA, PerceptionLM, VideoGPT+, Video-ChatGPT, Mobile-VideoGPT, etc.
Research Experience
  • Ph.D. Candidate in the Computer Vision Department at MBZUAI
  • Research Scientist Intern at Meta, working with Christoph Feichtenhofer
Education
  • Ph.D. in Computer Vision, MBZUAI; Advisors: Dr. Salman Khan and Prof. Fahad Khan.
Background
  • Research Interests: Multimodal large language models (MLLMs) for detailed video understanding, multimodal reasoning, and long-video understanding.