Galaxy Walker: Geometry-aware VLMs For Galaxy-scale Understanding

📅 2025-03-24

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Current vision-language models (VLMs) assume Euclidean space and rely on backbones that are poorly suited to anisotropic geometries. This limits their capacity to model astronomical phenomena that span several geometries, such as planetary orbits (spherical geometry) and black hole spacetime (hyperbolic geometry). To address this, we propose Galaxy-Walker, a geometry-aware VLM. Our method constructs a multi-scale physical graph, integrates spherical and hyperbolic space embeddings, and introduces a geometry prompt that generates geometry tokens by random walks across these heterogeneous spaces. A Mixture-of-Experts (MoE)-style geometry adapter then compresses and reshapes the anisotropy of those spaces. We further design a geometry-aware cross-modal alignment pretraining objective for unified multimodal modeling. On galaxy property estimation the model reaches R² scores up to 0.91, and on morphology classification it improves F1 by up to 0.17 on challenging features, significantly outperforming both domain-specific models and general-purpose VLMs. The work points toward galaxy-scale astronomical understanding grounded in non-Euclidean geometry.

📝 Abstract

Modern vision-language models (VLMs) build their patch embeddings and convolutional backbones in vector spaces, typically Euclidean ones, from the ground up. Extending VLMs to galaxy scale for understanding astronomical phenomena requires integrating spherical space for planetary orbits and hyperbolic space for black holes, which raises two formidable challenges: (a) current pre-trained models are confined to Euclidean space rather than a comprehensive geometric embedding, and (b) the predominant architectures lack suitable backbones for anisotropic physical geometries. In this paper, we introduce Galaxy-Walker, a geometry-aware VLM for universe-level vision understanding tasks. We propose a geometry prompt that generates geometry tokens by random walks across diverse spaces on a multi-scale physical graph, along with a geometry adapter that compresses and reshapes space anisotropy in a mixture-of-experts manner. Extensive experiments demonstrate the effectiveness of our approach: Galaxy-Walker achieves state-of-the-art performance in both galaxy property estimation ($R^2$ scores up to $0.91$) and morphology classification (up to $+0.17$ F1 improvement on challenging features), significantly outperforming both domain-specific models and general-purpose VLMs.
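To make the idea of generating tokens by random walks on a multi-scale physical graph more concrete, here is a minimal sketch of an unbiased random walk over a toy graph. It is an illustration under stated assumptions, not the paper's implementation: the graph, the node names, and the `random_walk` helper are all hypothetical, and the abstract does not specify how walks are sampled or how visited nodes become geometry tokens.

```python
import random


def random_walk(graph, start, length, seed=0):
    """Return the node ids visited by an unbiased random walk.

    graph  -- dict mapping a node id to a list of neighbour ids (hypothetical format)
    start  -- node id where the walk begins
    length -- maximum number of nodes in the walk, including the start
    """
    rng = random.Random(seed)  # local RNG so the sketch is reproducible
    walk = [start]
    for _ in range(length - 1):
        neighbours = graph[walk[-1]]
        if not neighbours:  # dead end: stop early
            break
        walk.append(rng.choice(neighbours))
    return walk


# Hypothetical toy graph whose nodes sit at different physical scales.
graph = {
    "galaxy": ["cluster", "star_a", "star_b"],
    "cluster": ["galaxy"],
    "star_a": ["galaxy", "planet"],
    "star_b": ["galaxy"],
    "planet": ["star_a"],
}

walk = random_walk(graph, "galaxy", length=5)
print(walk)
```

In a full system, each visited node would presumably be embedded in the space that matches its geometry before being passed to the model as a token. That step is left out here because the abstract does not describe it.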

Problem

Research questions and friction points this paper is trying to address.

Extending VLMs to galaxy scale with non-Euclidean geometry integration

Lack of suitable backbones for anisotropic physical geometries

Improving galaxy property estimation and morphology classification accuracy

Innovation

Methods, ideas, or system contributions that make the work stand out.

Geometry-aware VLM for universe-level tasks

Geometry prompt with multi-space random walks

Mixture-of-experts adapter for space anisotropy
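The mixture-of-experts adapter can be pictured with a small sketch in which each expert maps a token into one geometry and a softmax gate weights the results. Everything below is an assumption made for illustration: the three experts (identity for Euclidean space, normalization onto the unit sphere, and the exponential map at the origin of the Poincaré ball), the linear gate, and the plain weighted sum are not taken from the paper, which may combine its experts quite differently.

```python
import math


def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def euclidean_expert(v):
    # Flat space: leave the token unchanged.
    return list(v)


def spherical_expert(v, eps=1e-9):
    # Positive curvature: project the token onto the unit sphere.
    n = norm(v) + eps
    return [x / n for x in v]


def hyperbolic_expert(v, eps=1e-9):
    # Negative curvature: exponential map at the origin of the
    # Poincare ball (curvature -1), which lands strictly inside the ball.
    n = norm(v)
    scale = math.tanh(n) / (n + eps)
    return [x * scale for x in v]


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


EXPERTS = [euclidean_expert, spherical_expert, hyperbolic_expert]


def geometry_adapter(token, gate_weights):
    """Mix expert outputs with a softmax gate (hypothetical design).

    gate_weights holds one weight vector per expert; the gate score of an
    expert is the dot product of its weight vector with the token.
    """
    scores = [sum(w * x for w, x in zip(ws, token)) for ws in gate_weights]
    gates = softmax(scores)
    outputs = [expert(token) for expert in EXPERTS]
    mixed = [
        sum(g * out[i] for g, out in zip(gates, outputs))
        for i in range(len(token))
    ]
    return mixed, gates


token = [0.5, -1.2, 2.0]
gate_weights = [[0.1, 0.0, 0.2], [0.3, -0.1, 0.0], [-0.2, 0.4, 0.1]]
mixed, gates = geometry_adapter(token, gate_weights)
print(gates, mixed)
```

Averaging points that live on different manifolds is geometrically crude. The sketch only shows the gating pattern, in which the gate decides per token how much each geometry contributes.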
