🤖 AI Summary
This study addresses the critical challenge of efficiently and reliably extracting precise cosmological constraints on dark energy and dark matter from the massive, heterogeneous data streams generated by the Rubin Observatory's Legacy Survey of Space and Time (LSST). It systematically reviews current applications of artificial intelligence and machine learning in key tasks—including photometric redshift estimation, transient classification, weak gravitational lensing, and cosmological simulations—and identifies common bottlenecks such as uncertainty quantification, robustness to distributional shifts, and reproducible integration within scientific pipelines. The work proposes prioritizing scalable Bayesian inference, physics-informed model integration, active learning, and rigorous validation frameworks, while prospectively exploring the potential of foundation models and large language model–driven agents in scientific workflows. These insights offer a strategic pathway toward building trustworthy, scalable next-generation astronomical data analysis infrastructure, significantly enhancing the efficacy of multi-probe cosmological analyses.
📝 Abstract
The Vera C. Rubin Observatory's Legacy Survey of Space and Time (LSST) will produce unprecedented volumes of heterogeneous astronomical data (images, catalogs, and alerts) that challenge traditional analysis pipelines. The LSST Dark Energy Science Collaboration (DESC) aims to derive robust constraints on dark energy and dark matter from these data, requiring methods that are statistically powerful, scalable, and operationally reliable. Artificial intelligence and machine learning (AI/ML) are already embedded across DESC science workflows, from photometric redshifts and transient classification to weak lensing inference and cosmological simulations. Yet their utility for precision cosmology hinges on trustworthy uncertainty quantification, robustness to covariate shift and model misspecification, and reproducible integration within scientific pipelines. This white paper surveys the current landscape of AI/ML across DESC's primary cosmological probes and cross-cutting analyses, revealing that the same core methodologies and fundamental challenges recur across disparate science cases. Since progress on these cross-cutting challenges would benefit multiple probes simultaneously, we identify key methodological research priorities, including Bayesian inference at scale, physics-informed methods, validation frameworks, and active learning for discovery. With an eye on emerging techniques, we also explore the potential of the latest foundation model methodologies and LLM-driven agentic AI systems to reshape DESC workflows, provided their deployment is coupled with rigorous evaluation and governance. Finally, we discuss critical software, computing, data infrastructure, and human capital requirements for the successful deployment of these new methodologies, and consider associated risks and opportunities for broader coordination with external actors.