Scholar
Mubashara Akhtar
Google Scholar ID: x8K6TisAAAAJ
ETH AI Center fellow at ETH Zurich
NLP
Multimodality
Benchmarking & Evaluation
Citations & Impact
All-time
Citations: 433
H-index: 11
i10-index: 11
Publications: 20
Co-authors: 10 (list available)
Publications
9 items
When AI Benchmarks Plateau: A Systematic Study of Benchmark Saturation
2026 · Cited: 0
Reasoning with Confidence: Efficient Verification of LLM Reasoning Steps via Uncertainty Heads
2025 · Cited: 0
Who Evaluates AI's Social Impacts? Mapping Coverage and Gaps in First and Third Party Evaluations
2025 · Cited: 0
Compose and Fuse: Revisiting the Foundational Bottlenecks in Multimodal Reasoning
2025 · Cited: 0
Chimera: Diagnosing Shortcut Learning in Visual-Language Understanding
2025 · Cited: 0
LEXam: Benchmarking Legal Reasoning on 340 Law Exams
2025 · Cited: 0
AILuminate: Introducing v1.0 of the AI Risk and Reliability Benchmark from MLCommons
arXiv.org · 2025 · Cited: 0
Ev2R: Evaluating Evidence Retrieval in Automated Fact-Checking
arXiv.org · 2024 · Cited: 1
Co-authors
10 total
Elena Simperl
Director, King's Institute for AI & Director of research, Open Data Institute, United Kingdom
Andreas Vlachos
Professor, University of Cambridge
Vivek Gupta
Assistant Professor of Computer Science, Arizona State University
Zhijiang Guo
HKUST (GZ) | HKUST
Chenxi Pang
Research Software Engineer, Google DeepMind
Julian Martin Eisenschlos
NLP Researcher, Google DeepMind