🤖 AI Summary
Astronomical time series pose significant challenges for semantic modeling due to their extreme length, irregular sampling, and multimodal heterogeneity. To address this, we propose the first self-supervised learning framework integrating diffusion models with the Perceiver architecture. Methodologically, we employ a Perceiver encoder for efficient input compression and a Perceiver-IO diffusion decoder for high-fidelity sequence reconstruction, augmented by a self-supervised masked reconstruction objective as a strong baseline. Our key contributions are: (i) the first adaptation of diffusion generative paradigms to irregular scientific time series; and (ii) a scalable tokenization-based encoder-decoder design that unifies heterogeneous inputs—including multi-source spectroscopic and photometric data. On benchmark astronomical datasets, our approach significantly reduces reconstruction error compared to VAE and MAE baselines, yields more discriminative latent representations, and better preserves fine-grained structural patterns—such as transient light-curve morphology.
📝 Abstract
Self-supervised learning has become a central strategy for representation learning, but the majority of architectures used for encoding data have only been validated on regularly-sampled inputs such as images, audios. and videos. In many scientific domains, data instead arrive as long, irregular, and multimodal sequences. To extract semantic information from these data, we introduce the Diffusion Autoencoder with Perceivers (daep). daep tokenizes heterogeneous measurements, compresses them with a Perceiver encoder, and reconstructs them with a Perceiver-IO diffusion decoder, enabling scalable learning in diverse data settings. To benchmark the daep architecture, we adapt the masked autoencoder to a Perceiver encoder/decoder design, and establish a strong baseline (maep) in the same architectural family as daep. Across diverse spectroscopic and photometric astronomical datasets, daep achieves lower reconstruction errors, produces more discriminative latent spaces, and better preserves fine-scale structure than both VAE and maep baselines. These results establish daep as an effective framework for scientific domains where data arrives as irregular, heterogeneous sequences.