Publications
Published several papers, including 'A Practitioner’s Guide to Continual Multimodal Pretraining', 'No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance', 'Efficient Model Evaluation in an Era of Rapid Progress', 'CiteME: Can Language Models Accurately Cite Scientific Claims?', and 'Visual Data-Type Understanding does not emerge from Scaling Vision-Language Models'.
Research Experience
Interned at Google Zürich, working with Yongqin Xian, Alessio Tonioni, Federico Tombari, and Olivier Hénaff. Collaborated closely with Ferjad Naeem, Nikhil Parthasarathy, and Talfan Evans.
Education
Jointly working with Matthias Bethge at the University of Tübingen and Samuel Albanie at the University of Cambridge/Google DeepMind. Also part of the International Max Planck Research School for Intelligent Systems. Previously, an MPhil student in Machine Learning and Machine Intelligence at the University of Cambridge, with a thesis on 'Understanding and Fixing the Modality Gap in VLMs'. Graduated from IIIT Delhi with a Bachelor's in Computer Science in July 2020.
Background
Third-year ELLIS PhD student with research interests in data-centric machine learning, robustness and generalization under distribution shifts, and foundation models. Mainly focused on understanding the generalization properties of foundation models (such as vision-language models and large multimodal models) through their pre-training and test data distributions.
Miscellany
Previously worked with several mentors, including Ankush Gupta (Google DeepMind), Sungjin Ahn (KAIST), Tanmoy Chakraborty (IIT Delhi), Rajiv Ratn Shah (IIIT Delhi), Saket Anand (IIIT Delhi), Rajesh Kumar (Bucknell University), Anubha Gupta (IIIT Delhi), and Jainendra Shukla (IIIT Delhi).