Publications
'ActionAtlas: A VideoQA Benchmark for Domain-specialized Action Recognition' (NeurIPS 2024 D&B); 'Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models' (arXiv); 'CLIP meets Model Zoo Experts: Pseudo-Supervision for Visual Enhancement' (TMLR'24); 'SHARCS: Efficient Transformers through Routing with Dynamic Width Sub-networks' (EMNLP'23 Findings); 'Attentional Mixtures of Soft Prompt Tuning for Parameter-efficient Multi-task Knowledge Sharing' (EMNLP'22).
Awards
First place with team Sherlock AI at the AI Tinkerers Gen. AI hackathon; first place in the fixie.ai hackathon.
Research Experience
Interned with the AIML team at Apple. Involved in multiple research projects, including ActionAtlas, Molmo, and CLIP meets Model Zoo Experts.
Education
PhD: University of Washington, Computer Science, supervised by Hannaneh Hajishirzi and Ali Farhadi; B.Sc.: Sharif University of Technology, Computer Engineering.
Background
A fourth-year PhD student in computer science at the University of Washington, focusing on pretraining, post-training, and benchmarking multimodal large language models, specifically video-language models. Prior to UW, he received his B.Sc. in computer engineering from Sharif University of Technology.