Combining datasets with different ground truths using Low-Rank Adaptation to generalize image-based CNN models for photometric redshift prediction

📅 2026-01-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the complementary limitations of photometric redshifts (broad coverage but low precision) and spectroscopic redshifts (high accuracy but sparse sampling) by proposing a fusion approach based on Low-Rank Adaptation (LoRA). Building upon a pre-trained convolutional neural network (CNN), the method is efficiently fine-tuned on spectroscopic redshift data with minimal computational overhead. To the authors' knowledge, this is the first application of LoRA to astronomical redshift regression; it demonstrates marked improvements over conventional transfer learning, reducing bias by approximately 2.5 times and scatter by about 2.2 times. The approach substantially enhances model generalization while maintaining computational efficiency, offering a new paradigm for astrophysical regression tasks under data sparsity.
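The bias and scatter figures above can be made concrete with the evaluation metrics commonly used in photometric-redshift work: the mean normalized residual (bias) and the normalized median absolute deviation (scatter). This is a minimal sketch assuming those standard definitions; the paper's exact metric definitions may differ, and the sample values below are illustrative, not from the paper.

```python
import numpy as np

def photoz_metrics(z_pred, z_true):
    """Bias and NMAD scatter of normalized redshift residuals.

    Assumes the conventional photo-z definitions: residuals are
    normalized by (1 + z_true), and scatter is the robust NMAD
    estimator (1.4826 * median absolute deviation).
    """
    dz = (z_pred - z_true) / (1.0 + z_true)   # normalized residuals
    bias = float(np.mean(dz))                  # systematic offset
    nmad = float(1.4826 * np.median(np.abs(dz - np.median(dz))))
    return bias, nmad

# Illustrative (made-up) predictions for four galaxies.
z_true = np.array([0.5, 1.0, 1.5, 2.0])
z_pred = np.array([0.52, 0.98, 1.55, 1.97])
bias, scatter = photoz_metrics(z_pred, z_true)
```

A "2.5x less bias" claim then means the LoRA model's `bias` is roughly 2.5 times closer to zero than the transfer-learning baseline's on the same test set.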

📝 Abstract
In this work, we demonstrate how Low-Rank Adaptation (LoRA) can be used to combine different galaxy imaging datasets to improve redshift estimation with CNN models for cosmology. LoRA is an established technique for large language models that adds adapter networks to adjust model weights and biases, efficiently fine-tuning large base models without retraining them. We train a base model using a photometric redshift ground truth dataset, which contains broad galaxy types but is less accurate. We then fine-tune using LoRA on a spectroscopic redshift ground truth dataset. These redshifts are more accurate but limited to bright galaxies and take orders of magnitude more time to obtain, so are less available for large surveys. Ideally, the combination of the two datasets would yield more accurate models that generalize well. The LoRA model performs better than a traditional transfer learning method, with $\sim2.5\times$ less bias and $\sim2.2\times$ less scatter. Retraining the model on a combined dataset yields a model that generalizes better than LoRA but at a cost of greater computation time. Our work shows that LoRA is useful for fine-tuning regression models in astrophysics by providing a middle ground between full retraining and no retraining. LoRA shows potential in allowing us to leverage existing pretrained astrophysical models, especially for data-sparse tasks.
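The adapter mechanism the abstract describes can be sketched as follows. In standard LoRA, a frozen base weight $W$ is augmented by a trainable low-rank update $(\alpha/r)\,BA$, so fine-tuning touches only $r(d_{in}+d_{out})$ parameters instead of $d_{in}d_{out}$. This is a generic illustration of the technique, not the paper's CNN architecture; all names and shapes are our own.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 64, 32, 4, 8   # illustrative sizes, not from the paper

W = rng.standard_normal((d_out, d_in))     # frozen base-model weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, r))                   # trainable up-projection, zero-init

def lora_forward(x):
    """Base layer output plus the scaled low-rank correction (alpha/r) * B @ A @ x."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# Zero-initializing B means the adapted layer starts exactly at the base model,
# so fine-tuning begins from the pretrained (photometric) solution.
assert np.allclose(lora_forward(x), W @ x)

trainable = A.size + B.size   # 4*64 + 32*4 = 384 parameters
full = W.size                 # 32*64 = 2048 parameters
```

Only `A` and `B` would receive gradient updates during the spectroscopic fine-tuning stage, which is what makes the adaptation cheap relative to full retraining.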
Problem

Research questions and friction points this paper is trying to address.

photometric redshift
spectroscopic redshift
dataset combination
CNN models
generalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Low-Rank Adaptation
photometric redshift
CNN
transfer learning
spectroscopic redshift
Vikram Seenivasan
Department of Physics and Astronomy, UCLA, Los Angeles, CA 90095
Srinath Saikrishnan
Department of Computer Science, UCLA, Los Angeles, CA 90095
Andrew Lizarraga
PhD Student, UCLA
Representation Learning · Generative Modeling · Statistics
Jonathan Soriano
Department of Physics and Astronomy, UCLA, Los Angeles, CA 90095
Bernie Boscoe
Department of Computer Science, Southern Oregon University, Ashland, OR, 97520
Tuan Do
Department of Physics and Astronomy, UCLA, Los Angeles, CA 90095