🤖 AI Summary
Measuring the similarity of representations lacks rigorous formal definitions and systematic evaluation, a fundamental open problem in machine learning. This paper introduces ReSi, the first comprehensive benchmark grounded in well-defined notions of representational similarity. ReSi encompasses six test categories, 23 similarity measures (including CKA, CCA, and SVCCA), eleven neural architectures, and six datasets spanning the vision, language, and graph domains. It enables multi-dimensional comparison of representations across architectures and datasets. Methodologically, it provides a structured, verifiable, and extensible benchmarking framework with a unified testing paradigm and standardized evaluation protocols, filling a longstanding gap in systematic assessment. Empirical results expose performance disparities among existing measures across tasks and architectures. All code, data, and protocols are publicly released to ensure reproducibility and foster community-driven advancement.
📄 Abstract
Measuring the similarity of different representations of neural architectures is a fundamental task and an open research challenge for the machine learning community. This paper presents the first comprehensive benchmark for evaluating representational similarity measures based on well-defined groundings of similarity. The representational similarity (ReSi) benchmark consists of (i) six carefully designed tests for similarity measures, (ii) 23 similarity measures, (iii) eleven neural network architectures, and (iv) six datasets, spanning the graph, language, and vision domains. The benchmark opens up several important avenues of research on representational similarity that enable novel explorations and applications of neural architectures. We demonstrate the utility of the ReSi benchmark by conducting experiments on various neural network architectures, real-world datasets, and similarity measures. All components of the benchmark are publicly available, thereby facilitating the systematic reproduction of research results. The benchmark is extensible, and future research can build on and further expand it. We believe that the ReSi benchmark can serve as a sound platform catalyzing future research that aims to systematically evaluate existing ways of comparing representations of neural architectures and to explore novel ones.
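To give a concrete sense of what such similarity measures compute, the following is a minimal NumPy sketch of linear CKA, one of the measures included in the benchmark. It is an illustration of the standard formula, not the benchmark's own implementation; the function name and example data are hypothetical.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear Centered Kernel Alignment between two representation
    matrices X (n x d1) and Y (n x d2), rows aligned to the same n inputs.
    Returns a score in [0, 1]; 1 means the representations are identical
    up to orthogonal transformation and isotropic scaling."""
    X = X - X.mean(axis=0)  # center each feature
    Y = Y - Y.mean(axis=0)
    # ||Y^T X||_F^2 normalized by the feature-space norms of X and Y
    cross = np.linalg.norm(Y.T @ X, "fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, "fro")
    norm_y = np.linalg.norm(Y.T @ Y, "fro")
    return cross / (norm_x * norm_y)

# Toy representations: 100 inputs, 32-dimensional features (synthetic data)
rng = np.random.default_rng(0)
A = rng.normal(size=(100, 32))
Q, _ = np.linalg.qr(rng.normal(size=(32, 32)))  # random orthogonal matrix

print(linear_cka(A, A))            # identical representations -> 1.0
print(linear_cka(A, 2.0 * A @ Q))  # rotated and scaled copy -> ~1.0
```

The second call illustrates why CKA is popular for cross-model comparison: the score is invariant to orthogonal transformations and isotropic scaling of the feature space, so two networks can be judged similar even when their neurons are not aligned one-to-one.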