Mars-Bench: A Benchmark for Evaluating Foundation Models for Mars Science Tasks

📅 2025-10-27
🤖 AI Summary
The Mars science community lacks standardized evaluation benchmarks, which has limited the development of domain-specific foundation models. To address this, we introduce Mars-Bench, the first benchmark designed to systematically evaluate models across a broad range of Mars science tasks. It covers both orbital and surface imagery and supports classification, segmentation, and object detection of key geologic features such as craters, cones, boulders, and frost. Mars-Bench provides 20 datasets in a standardized, ready-to-use form, together with baseline evaluations of models pre-trained on natural images, Earth satellite data, and vision-language models. The results suggest that Mars-specific foundation models may offer advantages over general-domain counterparts, which motivates further work on domain-adapted pre-training. By providing a standardized, reproducible evaluation setup, Mars-Bench supports systematic assessment and comparison of machine learning models for Mars science.

📝 Abstract
Foundation models have enabled rapid progress across many specialized domains by leveraging large-scale pre-training on unlabeled data, demonstrating strong generalization to a variety of downstream tasks. While such models have gained significant attention in fields like Earth Observation, their application to Mars science remains limited. A key enabler of progress in other domains has been the availability of standardized benchmarks that support systematic evaluation. In contrast, Mars science lacks such benchmarks and standardized evaluation frameworks, which has limited progress toward developing foundation models for Martian tasks. To address this gap, we introduce Mars-Bench, the first benchmark designed to systematically evaluate models across a broad range of Mars-related tasks using both orbital and surface imagery. Mars-Bench comprises 20 datasets spanning classification, segmentation, and object detection, focused on key geologic features such as craters, cones, boulders, and frost. We provide standardized, ready-to-use datasets and baseline evaluations using models pre-trained on natural images, Earth satellite data, and state-of-the-art vision-language models. Results from all analyses suggest that Mars-specific foundation models may offer advantages over general-domain counterparts, motivating further exploration of domain-adapted pre-training. Mars-Bench aims to establish a standardized foundation for developing and comparing machine learning models for Mars science. Our data, models, and code are available at: https://mars-bench.github.io/.
Problem

Research questions and friction points this paper is trying to address.

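Mars science lacks standardized benchmarks and evaluation frameworks for foundation models
Foundation models have drawn attention in Earth Observation, but their application to Mars science remains limited
Without standardized, ready-to-use datasets for orbital and surface imagery, models for Martian tasks are hard to compare systematically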
Innovation

Methods, ideas, or system contributions that make the work stand out.

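Introduces Mars-Bench, the first benchmark for systematic evaluation across Mars science tasks
Provides 20 standardized, ready-to-use datasets spanning classification, segmentation, and object detection
Evaluates baselines pre-trained on natural images, Earth satellite data, and vision-language models, using both orbital and surface imagery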
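
To make the three task types concrete, the sketch below shows one metric that is commonly used for each: accuracy for classification, mean intersection-over-union (IoU) for segmentation, and box IoU as the matching criterion for object detection. These metric choices and all function names are assumptions for illustration only; this page does not state which metrics Mars-Bench reports, and the code is not taken from the Mars-Bench repository.

```python
from typing import List, Sequence, Tuple

# Axis-aligned box as (x_min, y_min, x_max, y_max). Hypothetical convention.
Box = Tuple[float, float, float, float]


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Classification: fraction of samples whose predicted class equals the label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_iou(y_true: Sequence[int], y_pred: Sequence[int], num_classes: int) -> float:
    """Segmentation: mean per-class IoU over flattened per-pixel labels.

    Classes that appear in neither the labels nor the predictions are skipped.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must be of equal length")
    ious: List[float] = []
    for c in range(num_classes):
        inter = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        union = sum(t == c or p == c for t, p in zip(y_true, y_pred))
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


def box_iou(a: Box, b: Box) -> float:
    """Detection: IoU of two boxes, typically thresholded to decide a match."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    # Toy inputs, not Mars-Bench data.
    print(accuracy([0, 1, 1, 2], [0, 1, 2, 2]))
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
    print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

Keeping one metric function per task type behind a shared call pattern is one simple way to compare differently pre-trained models on the same footing; a full detection score such as mean average precision would build on `box_iou` but is omitted here.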
Authors

Mirali Purohit
PhD Student, Arizona State University (ASU)
Computer Vision, Remote Sensing, Planetary Science, Earth Observation
Bimal Gajera
School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ, USA
Vatsal Malaviya
School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ, USA
Irish Mehta
School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ, USA
Kunal Kasodekar
School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ, USA
Jacob Adler
School of Earth and Space Exploration, Arizona State University, Tempe, AZ, USA
Steven Lu
Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA, USA
Umaa Rebbapragada
Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA, USA
Hannah Kerner
School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ, USA