🤖 AI Summary
This work investigates the use of graph convolutional networks (GCNs) to construct inference probes capable of accurately identifying the logic gate types of internal nodes in partially observable tree-structured recurrent Boolean circuits while ensuring strong generalization. By integrating one-dimensional low-distortion snowflake embeddings with statistically optimal transport theory within a feasible transductive learning framework, the study elucidates how the structure of computational graphs influences inference efficiency. The key theoretical contribution establishes that the generalization error rate is independent of the graph size and depends solely on the number of queried nodes \(N\), achieving the minimax-optimal rate \(\mathcal{O}(\sqrt{\log(2/\delta)/N})\) with probability at least \(1 - \delta\).
📝 Abstract
We study the statistical behaviour of reasoning probes in a stylized model of looped reasoning, given by Boolean circuits whose computational graph is a perfect $\nu$-ary tree ($\nu\ge 2$) and whose output is appended to the input and fed back iteratively for subsequent computation rounds. A reasoning probe has access to a sampled subset of internal computation nodes, possibly without covering the entire graph, and seeks to infer which $\nu$-ary Boolean gate is executed at each queried node, representing uncertainty via a probability distribution over a fixed collection of $\mathtt{m}$ admissible $\nu$-ary gates. This partial observability induces a generalization problem, which we analyze in a realizable, transductive setting. We show that, when the reasoning probe is parameterized by a graph convolutional network (GCN)-based hypothesis class and queries $N$ nodes, the worst-case generalization error attains the optimal rate $\mathcal{O}(\sqrt{\log(2/\delta)}/\sqrt{N})$ with probability at least $1-\delta$, for $\delta\in (0,1)$. Our analysis combines snowflake metric embedding techniques with tools from statistical optimal transport. A key insight is that this optimal rate is achievable independently of graph size, owing to the existence of a low-distortion one-dimensional snowflake embedding of the induced graph metric. As a consequence, our results provide a sharp characterization of how structural properties of the computational graph govern the statistical efficiency of reasoning under partial access.