🤖 AI Summary
This work addresses the efficient computation of string periods and shortest covers, two key primitives in text compression, computational biology, and pattern recognition. We propose a lightweight encoding strategy based on Characters-Distance-Sampling (CDS): the first character of the string serves as the only pivot, and the string is encoded by the distances between that pivot's occurrences. To our knowledge, this is the first dedicated application of CDS to period and cover analysis. Our approach eliminates redundant full-string scans inherent in conventional methods and integrates optimized period detection with streamlined cover verification. Experimental evaluation on standard benchmarks demonstrates speedups of 38%–43% for period computation and 63%–72% for shortest cover detection, compared to state-of-the-art baselines. The method is both algorithmically novel and practically deployable, combining conceptual simplicity with engineering utility.
📝 Abstract
Identifying regularities in strings, such as \emph{periods} and \emph{covers}, is crucial for applications in text compression, computational biology, and pattern recognition. \emph{Characters-Distance-Sampling} (\texttt{CDS}) is an efficient technique that encodes a string by storing distances between selected pivot characters, accelerating string-processing tasks. We apply \texttt{CDS} to compute periods and shortest covers, selecting only the first character as the pivot. This strategy yields optimized computations, achieving speedups of $38\%$--$43\%$ for period computation and $63\%$--$72\%$ for cover detection. These results demonstrate the potential of \texttt{CDS}-based representations for efficient string analysis and broader applications.
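The idea of using the first character as the only pivot can be illustrated with a minimal Python sketch. This is not the paper's algorithm, only a naive illustration under stated assumptions: a period $p < n$ forces the character at position $p$ to equal the first character, and every occurrence of a cover must start at an occurrence of the first character, so both searches can be restricted to pivot positions. All function names (`pivot_positions`, `cds_encode`, `smallest_period`, `shortest_cover`) are hypothetical and chosen here for illustration.

```python
def pivot_positions(s):
    """Positions where the pivot (the first character of s) occurs."""
    return [i for i, ch in enumerate(s) if ch == s[0]]


def cds_encode(s):
    """Distances between consecutive pivot occurrences (illustrative encoding)."""
    pos = pivot_positions(s)
    return [b - a for a, b in zip(pos, pos[1:])]


def smallest_period(s):
    """Smallest p such that s[i] == s[i + p] for all valid i.

    Only pivot positions are tried as candidates, because a period p < len(s)
    requires s[p] == s[0].
    """
    n = len(s)
    for p in pivot_positions(s)[1:]:
        if s[p:] == s[:n - p]:
            return p
    return n


def shortest_cover(s):
    """Shortest prefix whose occurrences cover every position of s.

    Occurrences are only looked for at pivot positions. This naive version
    tries every prefix length and makes no claim about efficiency.
    """
    n = len(s)
    pos = pivot_positions(s)
    for m in range(1, n + 1):
        candidate = s[:m]
        reach = 0  # first index not yet covered
        for i in pos:
            if i > reach:
                break  # uncovered gap before the next pivot
            if s.startswith(candidate, i):
                reach = i + m
        if reach == n:
            return candidate
    return s
```

For example, `smallest_period("abcabcab")` tests only the candidates 3 and 6 rather than every shift, and `shortest_cover("abaababaaba")` checks the prefix `"aba"` only at positions where `a` occurs.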