🤖 AI Summary
Existing approaches that operate in Euclidean space struggle to model the fine-grained semantic relationships and inherent hierarchical structures linking visual semantics and neural responses. This work proposes the first application of hyperbolic geometry to visual–neural mapping by introducing a Lorentz-model-based hyperbolic joint embedding framework. Leveraging the negative curvature of hyperbolic space as an inductive bias, the method aligns cross-subject fMRI responses with image semantics in a shared hyperbolic manifold through geodesic distance optimization. The proposed approach outperforms current state-of-the-art Euclidean methods on both multi-label semantic prediction and cross-modal retrieval tasks, indicating that hyperbolic space is better suited to modeling hierarchical cross-modal semantic relationships.
📝 Abstract
Understanding the intricate mappings between visual stimuli and neural responses is a fundamental challenge in cognitive neuroscience. While current approaches predominantly align images and functional magnetic resonance imaging (fMRI) responses in Euclidean space, this geometry often struggles to preserve fine-grained semantic relationships and latent hierarchical structures across visual and neural modalities. To overcome this, we propose HyNeuralMap, a framework that employs the Lorentz model of hyperbolic space to map visual semantics into a shared, cross-subject neural hierarchy. By leveraging the negative curvature of hyperbolic space as an inductive bias, the proposed framework better captures hierarchical semantic organization and cross-subject neural similarities. Specifically, visual and neural embeddings are jointly optimized through hyperbolic geometric alignment, where geodesic distances preserve semantic proximity and hierarchical relationships more effectively than Euclidean distances. Experiments demonstrate that HyNeuralMap consistently outperforms state-of-the-art Euclidean baselines in both multi-label semantic prediction and cross-modal retrieval tasks. These results indicate that hyperbolic geometry is better suited to cross-modal semantic alignment and hierarchical modeling, providing a new avenue for vision-neural representation learning.
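To make the geometric idea concrete, here is a minimal NumPy sketch of the two standard Lorentz-model operations that a joint embedding of this kind would rely on: lifting Euclidean feature vectors onto the hyperboloid and measuring the geodesic distance between the lifted points. It is not HyNeuralMap's implementation. The function names, the curvature parameter `c`, and the use of the exponential map at the origin are illustrative assumptions, and the abstract does not specify the encoders or the loss.

```python
import numpy as np

def lift_to_hyperboloid(v, c=1.0):
    """Exponential map at the origin (assumed lifting step, not from the paper).

    Maps Euclidean vectors v of shape (n, d) to points of shape (n, d + 1)
    on the hyperboloid of curvature -c, i.e. points x with <x, x>_L = -1/c.
    """
    sqrt_c = np.sqrt(c)
    # Clamp the norm to avoid division by zero for all-zero inputs.
    norm = np.maximum(np.linalg.norm(v, axis=-1, keepdims=True), 1e-9)
    time = np.cosh(sqrt_c * norm) / sqrt_c
    space = np.sinh(sqrt_c * norm) * v / (sqrt_c * norm)
    return np.concatenate([time, space], axis=-1)

def lorentz_inner(x, y):
    """Lorentzian inner product: -x0*y0 + sum_i xi*yi."""
    return -x[..., 0] * y[..., 0] + np.sum(x[..., 1:] * y[..., 1:], axis=-1)

def geodesic_distance(x, y, c=1.0):
    """Geodesic distance between hyperboloid points of curvature -c."""
    # Clip to the domain of arccosh; rounding can push the value below 1.
    arg = np.clip(-c * lorentz_inner(x, y), 1.0, None)
    return np.arccosh(arg) / np.sqrt(c)

# Hypothetical image and fMRI feature vectors (random stand-ins).
rng = np.random.default_rng(0)
image_feats = lift_to_hyperboloid(rng.normal(size=(4, 8)))
fmri_feats = lift_to_hyperboloid(rng.normal(size=(4, 8)))

# Pairwise distance matrix of shape (4, 4); an alignment objective would
# aim to make the diagonal (matched pairs) small relative to the rest.
dist = geodesic_distance(image_feats[:, None, :], fmri_feats[None, :, :])
```

A contrastive or retrieval-style objective could use `-dist` as the similarity score in place of a cosine similarity, so that matched image and fMRI pairs are pulled together along geodesics. Whether the paper does exactly this is not stated in the abstract.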