AI Summary
This paper addresses the challenge of inaccurate photometric redshift estimation caused by significant disparities in resolution and signal-to-noise ratio (SNR) across multi-band astronomical images. We propose a deep learning model for multi-source fusion that jointly leverages GALEX (ultraviolet), Pan-STARRS (optical), and unWISE (infrared) imaging data to estimate both point estimates of galaxy redshift and calibrated conditional redshift density distributions. A key contribution is the systematic comparison of early fusion (image stacking) versus late fusion (feature concatenation) under cross-resolution and cross-SNR conditions, demonstrating the model's ability to adaptively weight information across bands. Our architecture combines convolutional neural networks (CNNs) with quantile regression to enable uncertainty-aware density estimation. Evaluated on spectroscopically confirmed galaxies, the model achieves a bias of 0.010, a normalized median absolute deviation (NMAD) of 0.024, a catastrophic outlier rate of 17.53%, and exhibits strong probabilistic calibration of predicted densities.
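The point-estimate metrics quoted above (bias, NMAD, and catastrophic outlier rate) can be computed from normalized redshift residuals. The sketch below assumes the definitions conventional in the photometric-redshift literature (median bias, NMAD with the 1.4826 Gaussian-consistency factor, and a fixed outlier cut); the function name, the 0.15 cut, and the toy data are illustrative, not taken from the paper.

```python
import numpy as np

def photoz_metrics(z_phot, z_spec, outlier_cut=0.15):
    """Common photo-z point-estimate metrics (standard definitions assumed)."""
    dz = (z_phot - z_spec) / (1.0 + z_spec)                # normalized residuals
    bias = np.median(dz)                                   # median offset
    nmad = 1.4826 * np.median(np.abs(dz - np.median(dz)))  # robust scatter
    eta = np.mean(np.abs(dz) > outlier_cut)                # catastrophic outlier rate
    return bias, nmad, eta

# Toy example: the last galaxy is a catastrophic outlier.
z_spec = np.array([0.10, 0.20, 0.30, 0.40])
z_phot = np.array([0.11, 0.19, 0.31, 0.90])
bias, nmad, eta = photoz_metrics(z_phot, z_spec)   # eta = 0.25 here
```

Normalizing by (1 + z_spec) makes the residuals comparable across the redshift range, which is why all three headline numbers are quoted on that scale.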
Abstract
We present Mantis Shrimp, a multi-survey deep learning model for photometric redshift estimation that fuses ultraviolet (GALEX), optical (Pan-STARRS), and infrared (unWISE) imagery. Machine learning is now an established approach for photometric redshift estimation, with generally acknowledged higher performance than template-based methods in areas with a high density of spectroscopically identified galaxies. Multiple works have shown that image-based convolutional neural networks can outperform tabular-based color/magnitude models. In comparison to tabular models, image models have additional design complexities: it is largely unknown how to fuse inputs from different instruments with different resolutions or noise properties. The Mantis Shrimp model estimates the conditional density of redshift using cutout images. The density estimates are well calibrated, and the point estimates perform well across the distribution of available spectroscopically confirmed galaxies, with bias $= 1\times10^{-2}$, scatter (NMAD $= 2.44\times10^{-2}$), and catastrophic outlier rate $\eta = 17.53\%$. We find that early fusion approaches (e.g., resampling and stacking images from different instruments) match the performance of late fusion approaches (e.g., concatenating latent space representations), so the design choice is ultimately left to the user. Finally, we study how the models learn to use information across bands, finding evidence that our models successfully incorporate information from all surveys. The applicability of our model to the analysis of large populations of galaxies is limited by the speed of downloading cutouts from external servers; however, our model could be useful in smaller studies such as generating priors over redshift for stellar population synthesis.
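The early-versus-late fusion contrast above can be made concrete with a minimal sketch. The shapes, the 2x upsampling factor, and the mean-pooling "encoders" below are illustrative stand-ins (real cutout sizes and the paper's CNN encoders differ); the point is only the structural difference between stacking resampled images channel-wise and concatenating per-survey feature vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
uv  = rng.normal(size=(1, 32, 32))   # toy lower-resolution UV-like cutout
opt = rng.normal(size=(5, 64, 64))   # toy optical-like cutout, 5 bands

# Early fusion: resample every band onto a common pixel grid, then stack
# channels so a single CNN sees one multi-channel image.
uv_up = np.kron(uv, np.ones((1, 2, 2)))       # nearest-neighbor 2x upsample
early_input = np.concatenate([uv_up, opt], axis=0)   # shape (6, 64, 64)

# Late fusion: encode each survey separately (mean pooling stands in for a
# per-survey CNN), then concatenate the latent representations.
feat_uv  = uv.mean(axis=(1, 2))
feat_opt = opt.mean(axis=(1, 2))
late_features = np.concatenate([feat_uv, feat_opt])  # shape (6,)
```

Early fusion forces a shared grid (and hence resampling of the coarser survey), while late fusion lets each survey keep its native resolution at the cost of separate encoders; the abstract's finding is that both choices perform comparably.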