🤖 AI Summary
This work addresses a critical gap in topological deep learning: the absence of native benchmark datasets that inherently embody higher-order topological structures. Current research often relies on lifting existing graph datasets to construct such structures, which limits rigorous model evaluation and hinders progress. The study identifies this benchmark deficiency and advocates for the development of native datasets rooted in intrinsic higher-order topology. Drawing upon established higher-order modeling paradigms, such as message passing frameworks and sheaf theory, the paper proposes concrete directions and standardization guidelines for dataset construction. These contributions aim to establish a solid foundation for topological deep learning and catalyze the future development of benchmarks in higher-order machine learning.
📝 Abstract
After a somewhat rocky start, geometry and topology have established a foothold in machine learning. Message passing, either on graphs or higher-order complexes, is one of the main drivers of geometric deep learning, and paradigms that were once considered to be firmly in the realm of the abstract, like sheaves, have been "tamed" to serve as novel inductive biases for model architectures in topological deep learning. The veritable diversity of models, however, is in stark contrast to the scarcity of suitable benchmark datasets. As a result, researchers often resort to lifting existing graph datasets to include higher-order information. In this opinion paper, I want to encourage the community to also source new datasets, which may be used to prop up the foundations of our research field.