🤖 AI Summary
This paper addresses the decision and construction problems for string attractors, which are sets of positions related to the text compression efficiency of a word. For the decision problem (determining whether a given set of positions is an attractor of a string), it presents a linear-time verification algorithm based on the suffix automaton (SAM) or the directed acyclic word graph (DAWG). For the construction problem, it gives a greedy algorithm that generates an attractor for a given word. Although finding a smallest attractor is NP-hard, the greedy algorithm is efficient and produces very small attractors for several well-known families of words.
📝 Abstract
The article focuses on word (or string) attractors, which are sets of positions related to the text compression efficiency of the underlying word. It presents two combinatorial algorithms based on suffix automata or directed acyclic word graphs (DAWGs). The first algorithm decides in linear time whether a set of positions on the word is an attractor of the word. The second algorithm generates an attractor for a given word in a greedy manner. Although finding a smallest attractor is NP-hard, the greedy algorithm is efficient and produces very small attractors for several well-known families of words.
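To make the two problems concrete, here is a minimal brute-force sketch in Python. It assumes the standard definition: a set of positions is an attractor of a word if every distinct substring has at least one occurrence that covers a position in the set. This is not the article's method. It does not use suffix automata or DAWGs and runs in polynomial rather than linear time, and the greedy rule shown (pick the position that covers the most still-uncovered substrings) is an illustrative set-cover heuristic chosen for this sketch. The function names `substring_occurrences`, `is_attractor`, and `greedy_attractor` are hypothetical, and positions are 0-based.

```python
def substring_occurrences(w):
    """Map each distinct non-empty substring of w to the list of its start indices."""
    occ = {}
    n = len(w)
    for i in range(n):
        for j in range(i + 1, n + 1):
            occ.setdefault(w[i:j], []).append(i)
    return occ


def is_attractor(w, positions):
    """Brute-force decision: does every substring have an occurrence
    that covers at least one of the given 0-based positions?"""
    gamma = set(positions)
    for sub, starts in substring_occurrences(w).items():
        m = len(sub)
        covered = any(
            any(p in gamma for p in range(s, s + m)) for s in starts
        )
        if not covered:
            return False
    return True


def greedy_attractor(w):
    """Illustrative greedy construction (an assumption of this sketch,
    not the article's algorithm): repeatedly add the position that
    covers the largest number of still-uncovered substrings."""
    occ = substring_occurrences(w)
    uncovered = set(occ)
    gamma = set()
    while uncovered:
        best, best_cov = None, set()
        for p in range(len(w)):
            cov = {
                sub for sub in uncovered
                if any(s <= p < s + len(sub) for s in occ[sub])
            }
            if len(cov) > len(best_cov):
                best, best_cov = p, cov
        gamma.add(best)
        uncovered -= best_cov
    return sorted(gamma)


if __name__ == "__main__":
    print(is_attractor("abab", [0, 1]))  # positions 0 and 1 cover every substring
    print(is_attractor("abab", [0]))     # no occurrence of "b" covers position 0
    print(greedy_attractor("abab"))
```

Any attractor must contain at least one position per distinct letter, so the output of `greedy_attractor` can be compared against that simple lower bound when experimenting with small words.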