AI Summary
This work investigates the statistical query (SQ) learning complexity of semiautomata under the uniform distribution over input words and initial states, asking whether SQ hardness can be established from the internal transition structure alone when the alphabet size and input length are polynomial in the number of states $n$.
Method: We model the transition structures of two semiautomata as a random walk on the product group $S_n \times S_n$, where $S_n$ is the symmetric group on the $n$ states, and apply Fourier analysis and representation theory to derive tight bounds on the spectral gap of this walk, thereby quantifying how quickly state correlations decay with walk length.
Contribution/Results: We prove that SQ hardness arises intrinsically from algebraic properties of the transition structure, independent of the computational complexity of the recognized language. Specifically, for input length $\mathrm{poly}(n)$, the final states of any two distinct semiautomata become nearly uncorrelated after polynomially many steps, so statistical queries carry too little information to tell them apart, which yields the hardness result.
Abstract
Semiautomata form a rich class of sequence-processing algorithms with applications in natural language processing, robotics, computational biology, and data mining. We establish the first Statistical Query hardness result for semiautomata under the uniform distribution over input words and initial states. We show that Statistical Query hardness can be established when both the alphabet size and input length are polynomial in the number of states. Unlike the case of deterministic finite automata, where hardness typically arises through the hardness of the language they recognize (e.g., parity), our result is derived solely from the internal state-transition structure of semiautomata. Our analysis reduces the task of distinguishing the final states of two semiautomata to studying the behavior of a random walk on the group $S_{N} \times S_{N}$. By applying tools from Fourier analysis and the representation theory of the symmetric group, we obtain tight spectral gap bounds, demonstrating that after a polynomial number of steps in the number of states, distinct semiautomata become nearly uncorrelated, yielding the desired hardness result.
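The reduction described above can be illustrated with a small simulation. This is a minimal sketch and not the paper's construction: it assumes each letter acts on the states as a permutation (suggested by the walk on $S_{N} \times S_{N}$, but an assumption here), and the names `random_permutation_semiautomaton`, `run`, and `agreement_rate` are hypothetical. It draws uniform words and initial states and estimates how often two semiautomata end in the same state as the word length grows.

```python
import random


def random_permutation_semiautomaton(n_states, alphabet_size, rng):
    """Assumption: each letter acts as a uniformly random permutation of the states."""
    return [rng.sample(range(n_states), n_states) for _ in range(alphabet_size)]


def run(transitions, start, word):
    """Return the final state after reading `word` from state `start`."""
    state = start
    for letter in word:
        state = transitions[letter][state]
    return state


def agreement_rate(a, b, n_states, alphabet_size, length, trials, rng):
    """Estimate P[final states of a and b coincide] over uniform words and start states."""
    hits = 0
    for _ in range(trials):
        word = [rng.randrange(alphabet_size) for _ in range(length)]
        start = rng.randrange(n_states)
        hits += run(a, start, word) == run(b, start, word)
    return hits / trials


rng = random.Random(0)
n, k = 6, 4  # illustrative sizes, not taken from the paper
A = random_permutation_semiautomaton(n, k, rng)
B = random_permutation_semiautomaton(n, k, rng)

# Reading a word moves the pair of states by (x, y) -> (a_letter(x), b_letter(y)),
# i.e. a random walk driven by pairs of permutations.
for length in (0, 1, 2, 4, 8, 16, 32):
    rate = agreement_rate(A, B, n, k, length, 5000, rng)
    print(f"length={length:2d}  agreement={rate:.3f}")
```

At length 0 the two machines trivially agree. If the joint walk mixes, one would expect the agreement rate to drift toward roughly $1/n$, the value for independent uniform final states. How fast that happens is what a spectral gap bound controls; this sketch only observes the trend empirically and proves nothing.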