🤖 AI Summary
This work proposes Velocityformer, a velocity reconstruction method for spectroscopic galaxy surveys that raises the signal-to-noise ratio of kinematic Sunyaev–Zel’dovich (kSZ) effect measurements. The model is an equivariant graph transformer whose symmetry matches that of the observed data, in which the line of sight defines a preferred direction, and it is conditioned on a physics-based long-wavelength solution. It improves the correlation coefficient between reconstructed and true velocities by 35% over the standard linear-theory baseline and, on high-fidelity simulated galaxy catalogs, by 30% over the physical baseline. The model trains to high accuracy on as few as four low-fidelity simulations and generalizes zero-shot across input geometry, cosmological parameters, and galaxy sample.
📝 Abstract
Precise measurement of the kinematic Sunyaev-Zel'dovich (kSZ) effect, a probe of the large-scale distribution of baryonic matter and a key observable for cosmological inference, requires accurate reconstruction of galaxy velocities from spectroscopic surveys. The signal-to-noise ratio (SNR) of kSZ measurements scales directly with the correlation coefficient $r$ between reconstructed and true velocities. We introduce Velocityformer, an equivariant graph transformer architecture designed to match the specific symmetry of the observational data. While the underlying physics is equivariant with respect to translations and rotations, observational effects break this symmetry by introducing a preferred line-of-sight direction. Matching the model's inductive bias to this broken symmetry consistently improves performance across all model sizes and training volumes: Velocityformer improves $r$ by 35% over the standard linear theory baseline and outperforms ML baselines at every data volume. Because its inductive bias matches the data and it is conditioned on the physics-based long-wavelength solution, Velocityformer is highly data-efficient, training to high accuracy on as few as 4 low-fidelity simulations, and generalises zero-shot across input geometry, cosmological parameters, and galaxy sample. On high-fidelity simulated galaxy catalogues, this yields a 30% improvement in $r$ over the physical baseline, which translates directly into the same SNR gain on observational data.
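The figure of merit quoted above, the correlation coefficient $r$ between reconstructed and true velocities, can be illustrated with a minimal sketch. It assumes $r$ is the standard Pearson coefficient and that the quoted percentages are relative gains in $r$; the function names and the toy Gaussian velocities are hypothetical and are not taken from the paper.

```python
import numpy as np


def correlation_coefficient(v_rec, v_true):
    """Pearson correlation between reconstructed and true velocities.

    Assumption: this is the definition of r used in the text.
    """
    v_rec = np.asarray(v_rec, dtype=float)
    v_true = np.asarray(v_true, dtype=float)
    d_rec = v_rec - v_rec.mean()
    d_true = v_true - v_true.mean()
    return float(np.sum(d_rec * d_true)
                 / np.sqrt(np.sum(d_rec**2) * np.sum(d_true**2)))


def relative_gain(r_new, r_base):
    """Fractional improvement of r_new over r_base (0.30 means 30%)."""
    return r_new / r_base - 1.0


# Toy data only: synthetic line-of-sight velocities (km/s) with two
# hypothetical reconstructions that differ in their noise level.
rng = np.random.default_rng(0)
v_true = rng.normal(0.0, 300.0, size=10_000)
v_base = v_true + rng.normal(0.0, 400.0, size=v_true.size)   # noisier baseline
v_model = v_true + rng.normal(0.0, 250.0, size=v_true.size)  # less noisy model

r_base = correlation_coefficient(v_base, v_true)
r_model = correlation_coefficient(v_model, v_true)
print(f"r_base={r_base:.3f}  r_model={r_model:.3f}  "
      f"gain={relative_gain(r_model, r_base):.1%}")
```

Under the stated scaling of SNR with $r$, the relative gain printed here would carry over to the kSZ SNR; the numbers themselves come from the toy noise levels and carry no physical meaning.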