🤖 AI Summary
Time-domain astronomy AI faces a critical bottleneck in image preprocessing efficiency: fragmented CPU-based algorithms struggle to support large-scale data volumes and real-time analysis requirements. This paper introduces the first GPU-native preprocessing framework specifically designed for astronomical AI, unifying core workflows—including image quality assessment, astrometric registration, coaddition, background modeling, and photometry—into a single, cohesive system. We propose a novel dual-mode architecture comprising Eager mode (for real-time parameter tuning) and Pipeline mode (for batch deployment), with deep optimizations in CUDA parallelization and memory access patterns. Experiments demonstrate that our framework achieves photometric and astrometric accuracy comparable to state-of-the-art CPU-based methods while accelerating preprocessing by one to two orders of magnitude. The framework is open-sourced via a Docker image and has been rigorously validated on both simulated and real survey data, ensuring full compatibility with mainstream astronomical AI models.
📝 Abstract
The rapid advancement of image analysis methods in time-domain astronomy, particularly those leveraging AI algorithms, has highlighted efficient image pre-processing as a critical bottleneck affecting algorithm performance. Image pre-processing, which involves standardizing images for training or deployment of various AI algorithms, encompasses essential steps such as image quality evaluation, alignment, stacking, background extraction, gray-scale transformation, cropping, source detection, astrometry, and photometry. Historically, these algorithms were developed independently by different research groups, primarily based on CPU architecture for small-scale data processing. This paper introduces a novel framework for image pre-processing that integrates key algorithms specifically modified for GPU architecture, enabling large-scale image pre-processing for different algorithms. To prepare for the new algorithm design paradigm in the AI era, we have implemented two operational modes in the framework for different application scenarios: Eager mode and Pipeline mode. The Eager mode facilitates real-time feedback and flexible adjustments, which could be used for parameter tuning and algorithm development. The pipeline mode is primarily designed for large scale data processing, which could be used for training or deploying of artificial intelligence models. We have tested the performance of our framework using simulated and real observation images. Results demonstrate that our framework significantly enhances image pre-processing speed while maintaining accuracy levels comparable to CPU based algorithms. To promote accessibility and ease of use, a Docker version of our framework is available for download in the PaperData Repository powered by China-VO, compatible with various AI algorithms developed for time-domain astronomy research.