🤖 AI Summary
ABX discriminative evaluation, widely adopted for assessing the phoneme discriminability of self-supervised speech representations, is hindered by scarce tooling and computational inefficiency. To address this, we introduce FastABX: the first general-purpose, modular, and highly optimized ABX framework, implemented in Python with Cython acceleration, memory-mapped I/O, and batched cosine/Euclidean distance computation. It supports flexible configuration and arbitrary representation inputs. FastABX accelerates ABX task construction and distance computation by over an order of magnitude compared with baseline implementations, enabling real-time processing of datasets with millions of samples. Its generic architecture is not tied to speech, providing scalable infrastructure for discriminative evaluation of multimodal representations. The open-source implementation has been integrated into multiple representation learning projects, where it serves as a community standard for rigorous, efficient ABX benchmarking.
📝 Abstract
We introduce fastabx, a high-performance Python library for building ABX discrimination tasks. ABX is a measure of the separation between generic categories of interest. It has been used extensively to evaluate phonetic discriminability in self-supervised speech representations. However, its broader adoption has been limited by the absence of adequate tools. fastabx addresses this gap by providing a framework capable of constructing any type of ABX task while delivering the efficiency necessary for rapid development cycles, both in task creation and in calculating distances between representations. We believe that fastabx will serve as a valuable resource for the broader representation learning community, enabling researchers to systematically investigate what information can be directly extracted from learned representations across several domains beyond speech processing. The source code is available at https://github.com/bootphon/fastabx.
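To make the notion of ABX discrimination concrete, the following is a minimal NumPy sketch of an ABX error rate between two categories. It is not the fastabx API: the function names `cosine_distance` and `abx_error_rate`, the use of cosine distance, the assumption of one fixed-length vector per item, and the half-credit rule for ties are all illustrative choices. For each triple (a, b, x), with a and x distinct items from category A and b from category B, an error is counted when x lies closer to b than to a.

```python
import numpy as np


def cosine_distance(u: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Pairwise cosine distance between the rows of u (n, d) and v (m, d)."""
    u = u / np.linalg.norm(u, axis=1, keepdims=True)
    v = v / np.linalg.norm(v, axis=1, keepdims=True)
    return 1.0 - u @ v.T


def abx_error_rate(cat_a: np.ndarray, cat_b: np.ndarray) -> float:
    """ABX error rate with X drawn from category A (hypothetical helper).

    An error occurs when d(a, x) > d(b, x); ties count as half an error.
    """
    d_ax = cosine_distance(cat_a, cat_a)  # shape (n_a, n_a), indexed [a, x]
    d_bx = cosine_distance(cat_b, cat_a)  # shape (n_b, n_a), indexed [b, x]

    # Broadcast to shape (n_a, n_b, n_a), indexed [a, b, x].
    diff = d_ax[:, None, :] - d_bx[None, :, :]
    errors = (diff > 0) + 0.5 * (diff == 0)

    # Drop degenerate triples where a and x are the same item.
    n_a = cat_a.shape[0]
    keep = ~np.eye(n_a, dtype=bool)[:, None, :]
    keep = np.broadcast_to(keep, errors.shape)
    return float(errors[keep].mean())


# Two toy categories pointing in clearly different directions.
cat_a = np.array([[1.0, 0.0], [1.0, 0.1], [0.9, 0.0]])
cat_b = np.array([[0.0, 1.0], [0.1, 1.0]])
score = abx_error_rate(cat_a, cat_b)
print(score)
```

A score near 0 indicates that the two categories are well separated in the representation space, while a score near 0.5 indicates chance-level discrimination. The measure is asymmetric in A and B, so a symmetric score can be obtained by averaging `abx_error_rate(cat_a, cat_b)` and `abx_error_rate(cat_b, cat_a)`.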