🤖 AI Summary
This work addresses the sensitivity of the classical Hausdorff distance to finite perturbations and outliers when comparing infinite sets such as formal languages, which prevents it from capturing asymptotic similarity. To overcome this limitation, the paper introduces the Asymptotic Hausdorff lifting, $\mathbb{AH}_{d}$, which lifts an element-level metric $d$ (in particular a normalized edit distance such as Marzal and Vidal's $\textsf{ned}$) to a (pseudo-)metric on sets. The construction disregards finite deviations while reflecting the asymptotic edit behavior of languages over infinite domains. The paper studies the equivalence classes of regular languages induced by this (pseudo-)metric, characterizes their asymptotic essence, and investigates the computation of $\mathbb{AH}_{\textsf{ned}}$ for regular languages and for bounded context-free languages.
📝 Abstract
We introduce the \textit{Asymptotic Hausdorff} lifting, denoted $\mathbb{AH}_{d}$, a general method for lifting an element-level metric $d$ to a (pseudo-)metric on sets that captures asymptotic similarity in infinite domains equipped with a notion of size. The construction is designed to be insensitive to finite deviations and to avoid the limitations of classical Hausdorff-based approaches, which are often overly sensitive to outliers and fail to reflect asymptotic behavior.
Formal languages provide a central motivating instance of this framework, where elements are words and sets are languages. When applied to normalized edit distances, the Asymptotic Hausdorff lifting yields distances between languages that reflect asymptotic edit behavior while preserving metric structure. We study the equivalence classes of regular languages induced by $\mathbb{AH}_{d}$ for normalized edit distances $d$, and characterize their asymptotic essence. Focusing in particular on the normalized edit distance of Marzal and Vidal, $\textsf{ned}$, we investigate the computation of $\mathbb{AH}_{\textsf{ned}}$ for regular languages and for bounded context-free languages.
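To make the outlier sensitivity of the classical approach concrete, the following is a minimal sketch and not the paper's construction. It assumes the usual path-based definition of $\textsf{ned}$ with unit operation costs (minimum over edit paths of path weight divided by path length), and it computes the classical Hausdorff distance on finite sets of words only. The function names `ned` and `hausdorff` are illustrative, and the sketch does not implement $\mathbb{AH}_{d}$, whose definition is not given in the abstract.

```python
from fractions import Fraction


def ned(x: str, y: str) -> Fraction:
    """Normalized edit distance with unit costs (assumed definition):
    minimum over edit paths of (path weight) / (path length)."""
    m, n = len(x), len(y)
    if m == 0 and n == 0:
        return Fraction(0)  # convention assumed here for two empty words

    # best[i][j] maps a path length k to the least weight of any
    # k-step edit path that turns x[:i] into y[:j].
    best = [[{} for _ in range(n + 1)] for _ in range(m + 1)]
    best[0][0][0] = 0

    def relax(cell: dict, k: int, w: int) -> None:
        # Keep the smallest weight seen for paths of length k.
        if w < cell.get(k, w + 1):
            cell[k] = w

    for i in range(m + 1):
        for j in range(n + 1):
            for k, w in best[i][j].items():
                if i < m:
                    relax(best[i + 1][j], k + 1, w + 1)  # deletion
                if j < n:
                    relax(best[i][j + 1], k + 1, w + 1)  # insertion
                if i < m and j < n:
                    cost = 0 if x[i] == y[j] else 1  # match or substitution
                    relax(best[i + 1][j + 1], k + 1, w + cost)

    return min(Fraction(w, k) for k, w in best[m][n].items())


def hausdorff(A, B, d=ned):
    """Classical Hausdorff distance between finite, non-empty sets."""
    def directed(S, T):
        return max(min(d(s, t) for t in T) for s in S)
    return max(directed(A, B), directed(B, A))


A = {"aa", "aaaa"}
B = A | {"bbbb"}        # the same set plus one outlier word
print(hausdorff(A, A))  # 0
print(hausdorff(A, B))  # 1
```

In this toy case a single extra word that shares no letter with any word of the other set pushes the classical Hausdorff distance to its maximum value of 1, which is the kind of finite deviation the Asymptotic Hausdorff lifting is designed to ignore.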