🤖 AI Summary
This work addresses two challenges in hypergraph learning: jointly modeling higher-order structural dependencies alongside vertex features, and the absence of general-purpose foundation models for hypergraphs. To this end, we propose Hyper-FM, the first hypergraph foundation model designed for cross-domain knowledge extraction. Methodologically, we introduce a hierarchical high-order neighbor-guided vertex knowledge embedding mechanism and a hierarchical multi-hypergraph-guided structural knowledge extraction framework. Moreover, we establish the first scaling law for hypergraph foundation models, revealing that domain diversity contributes more to performance gains than merely expanding parameter count or data scale. Evaluated on a newly constructed benchmark of 10 text-attributed hypergraph datasets, Hyper-FM achieves an average 13.3% improvement over state-of-the-art baselines, empirically validating its cross-domain structural generalization and effective knowledge transfer, and establishing a new direction for hypergraph foundation model research.
📝 Abstract
Hypergraph neural networks (HGNNs) model complex high-order relationships in domains such as protein interaction and social networks: a hyperedge connects multiple vertices at once, which strengthens modeling capability and reduces information loss. Developing foundation models for hypergraphs is challenging because hypergraph data couples vertex features with intricate structural information. We present Hyper-FM, a Hypergraph Foundation Model for multi-domain knowledge extraction, featuring Hierarchical High-Order Neighbor Guided Vertex Knowledge Embedding for vertex feature representation and Hierarchical Multi-Hypergraph Guided Structural Knowledge Extraction for structural information. Additionally, we curate 10 text-attributed hypergraph datasets to advance research at the intersection of HGNNs and large language models (LLMs). Experiments on these datasets show that Hyper-FM outperforms baseline methods by approximately 13.3%, validating our approach. Furthermore, we propose the first scaling law for hypergraph foundation models, demonstrating that increasing domain diversity enhances performance far more than merely increasing vertex and hyperedge counts, underscoring the critical role of domain diversity in scaling hypergraph models.
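For readers unfamiliar with hypergraphs, the sketch below shows how a hyperedge groups several vertices via an incidence matrix, and how a generic vertex-to-hyperedge-to-vertex averaging step propagates features. This is a simplified, textbook-style illustration of HGNN-style message passing under assumed toy data, not the Hyper-FM architecture; all names and values here are hypothetical.

```python
import numpy as np

# Toy hypergraph: 4 vertices, 2 hyperedges. Unlike a graph edge,
# hyperedge e0 joins three vertices at once.
# Incidence matrix H: H[v, e] = 1 if vertex v belongs to hyperedge e.
H = np.array([
    [1, 0],   # v0 in e0
    [1, 1],   # v1 in e0 and e1
    [1, 0],   # v2 in e0
    [0, 1],   # v3 in e1
], dtype=float)

X = np.eye(4)  # one-hot vertex features (hypothetical placeholder)

# One round of generic two-stage hypergraph propagation:
# average each hyperedge's member features, then average each
# vertex's incident-hyperedge features back onto the vertex.
De_inv = np.diag(1.0 / H.sum(axis=0))   # inverse hyperedge degrees
Dv_inv = np.diag(1.0 / H.sum(axis=1))   # inverse vertex degrees
edge_feats = De_inv @ H.T @ X           # vertex -> hyperedge aggregation
X_new = Dv_inv @ H @ edge_feats         # hyperedge -> vertex distribution

print(X_new.round(2))
```

Because both stages are degree-normalized averages, each row of `X_new` is a convex combination of the original features of vertices reachable through a shared hyperedge, which is how high-order (many-vertex) relations influence representations in a single step.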