Galaxy Walker: Geometry-aware VLMs For Galaxy-scale Understanding

📅 2025-03-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current vision-language models (VLMs) are constrained by Euclidean-space assumptions and isotropic backbone architectures, which limits their capacity to model astronomical phenomena that span multiple geometries, such as planetary orbits (spherical geometry) and black hole spacetime (hyperbolic geometry). To address this, we propose Galaxy-Walker, a geometry-aware VLM. Our method constructs multi-scale physical graph representations, integrates spherical and hyperbolic space embeddings, introduces a geometric prompting mechanism, and employs a Mixture-of-Experts (MoE)-style geometric adapter. Together these enable random-walk token generation and anisotropic structural compression across heterogeneous geometric spaces. We further design a geometry-aware cross-modal alignment pretraining objective for unified multimodal modeling. On galaxy property estimation (R² up to 0.91) and morphological classification (F1 improvement of up to 0.17 on challenging features), our model significantly outperforms both domain-specific and general-purpose VLMs. This work establishes a paradigm for cosmological-scale astronomical understanding grounded in differential geometry.

📝 Abstract
Modern vision-language models (VLMs) build their patch embeddings and convolutional backbones in vector spaces, typically Euclidean ones, from the outset. When VLMs are extended to the galaxy scale to understand astronomical phenomena, integrating spherical space for planetary orbits and hyperbolic space for black holes raises two formidable challenges: (a) current pre-trained models are confined to Euclidean space rather than a comprehensive geometric embedding, and (b) the predominant architectures lack suitable backbones for anisotropic physical geometries. In this paper, we introduce Galaxy-Walker, a geometry-aware VLM for universe-level vision understanding tasks. We propose a geometry prompt that generates geometry tokens by random walks across diverse spaces on a multi-scale physical graph, along with a geometry adapter that compresses and reshapes the space anisotropy in a mixture-of-experts manner. Extensive experiments demonstrate the effectiveness of our approach: Galaxy-Walker achieves state-of-the-art performance in both galaxy property estimation ($R^2$ scores up to $0.91$) and morphology classification (up to $+0.17$ F1 improvement on challenging features), significantly outperforming both domain-specific models and general-purpose VLMs.
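To make the idea of generating tokens by random walks on a multi-scale physical graph more concrete, here is a minimal sketch of a uniform random walk over a toy graph. The graph, its node names, and the `random_walk` helper are hypothetical illustrations and are not taken from the paper, which may sample walks quite differently.

```python
import random

# Hypothetical toy graph: nodes are objects at different physical scales,
# edges link objects that are physically related. Not taken from the paper.
GRAPH = {
    "galaxy": ["cluster", "black_hole", "star"],
    "cluster": ["galaxy"],
    "black_hole": ["galaxy"],
    "star": ["galaxy", "planet"],
    "planet": ["star"],
}


def random_walk(graph, start, length, rng):
    """Return at most `length` nodes (length >= 1) visited by a uniform random walk."""
    walk = [start]
    while len(walk) < length:
        neighbors = graph[walk[-1]]
        if not neighbors:  # dead end: stop early
            break
        walk.append(rng.choice(neighbors))  # step to a uniformly chosen neighbor
    return walk


rng = random.Random(0)  # fixed seed so the walk is reproducible
walk = random_walk(GRAPH, "planet", 6, rng)
print(walk)
```

Each visited node could then be mapped to an embedding and fed to the model as a token; that mapping is left out here because the abstract does not specify it.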
Problem

Research questions and friction points this paper is trying to address.

Extending VLMs to galaxy scale with non-Euclidean geometry integration
Lack of suitable backbones for anisotropic physical geometries
Improving galaxy property estimation and morphology classification accuracy
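The gap between Euclidean and non-Euclidean embeddings can be illustrated by comparing distance functions. The sketch below uses standard textbook formulas (great-circle distance on the unit sphere and the Poincaré ball distance with curvature -1); the choice of these particular models is an assumption for illustration, since the abstract does not state which spherical or hyperbolic models the paper uses.

```python
import math


def squared_norm(v):
    return sum(a * a for a in v)


def euclidean_distance(x, y):
    """Straight-line distance in flat space."""
    return math.sqrt(squared_norm([a - b for a, b in zip(x, y)]))


def spherical_distance(x, y):
    """Great-circle distance between two unit vectors on the unit sphere."""
    dot = sum(a * b for a, b in zip(x, y))
    return math.acos(max(-1.0, min(1.0, dot)))  # clamp against rounding error


def poincare_distance(x, y):
    """Distance in the Poincare ball (curvature -1); both points need norm < 1."""
    diff = squared_norm([a - b for a, b in zip(x, y)])
    denom = (1.0 - squared_norm(x)) * (1.0 - squared_norm(y))
    return math.acosh(1.0 + 2.0 * diff / denom)


origin, point = (0.0, 0.0), (0.5, 0.0)
print(euclidean_distance(origin, point))  # flat-space distance
print(poincare_distance(origin, point))   # larger: hyperbolic space expands toward the boundary
print(spherical_distance((1.0, 0.0, 0.0), (0.0, 1.0, 0.0)))  # quarter of a great circle
```

The same pair of coordinates yields different distances depending on the geometry, which is why an embedding trained purely under the Euclidean metric can misrepresent structure that is naturally spherical or hyperbolic.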
Innovation

Methods, ideas, or system contributions that make the work stand out.

Geometry-aware VLM for universe-level tasks
Geometry prompt with multi-space random walks
Mixture-of-experts adapter for space anisotropy
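As a rough illustration of a mixture-of-experts adapter over several geometries, the sketch below mixes the outputs of three experts using a softmax gate computed from the input. The experts, the gate weights, and the `moe_adapter` function are hypothetical stand-ins chosen for simplicity; they are not the adapter described in the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def norm(x):
    return math.sqrt(sum(v * v for v in x))


# Hypothetical experts, one per geometry (illustrative stand-ins only).
def euclidean_expert(x):
    return list(x)  # leave the vector unchanged


def spherical_expert(x):
    n = norm(x) or 1.0  # avoid dividing by zero for the zero vector
    return [v / n for v in x]  # project onto the unit sphere


def hyperbolic_expert(x):
    n = norm(x)
    return [v / (1.0 + n) for v in x]  # squash into the open unit ball


EXPERTS = [euclidean_expert, spherical_expert, hyperbolic_expert]


def moe_adapter(x, experts, gate_weights):
    """Mix expert outputs with an input-dependent softmax gate.

    `gate_weights` holds one row of linear weights per expert.
    """
    logits = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    gates = softmax(logits)
    outputs = [expert(x) for expert in experts]
    mixed = [sum(g * out[i] for g, out in zip(gates, outputs)) for i in range(len(x))]
    return mixed, gates


# Arbitrary example gate weights (assumed, not learned).
GATE_WEIGHTS = [[0.1, 0.0], [0.0, 0.2], [-0.1, 0.1]]
mixed, gates = moe_adapter([3.0, 4.0], EXPERTS, GATE_WEIGHTS)
print(gates, mixed)
```

In a trained model the gate weights would be learned, so that inputs with, say, orbit-like structure route more weight to the spherical expert; here they are fixed by hand only to show the mechanics.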