Statistical Guarantees for Reasoning Probes on Looped Boolean Circuits

📅 2026-02-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates the use of graph convolutional networks (GCNs) to construct reasoning probes that accurately identify the logic gate types of internal nodes in partially observable, looped Boolean circuits whose computational graph is a perfect tree, while ensuring strong generalization. By combining one-dimensional low-distortion snowflake embeddings with tools from statistical optimal transport in a realizable, transductive learning framework, the study elucidates how the structure of the computational graph governs inference efficiency. The key theoretical contribution establishes that the generalization error is independent of the graph size and depends solely on the number of queried nodes \(N\), achieving the minimax-optimal rate \(\mathcal{O}(\sqrt{\log(2/\delta)/N})\) with probability at least \(1 - \delta\).

📝 Abstract
We study the statistical behaviour of reasoning probes in a stylized model of looped reasoning, given by Boolean circuits whose computational graph is a perfect $\nu$-ary tree ($\nu\ge 2$) and whose output is appended to the input and fed back iteratively for subsequent computation rounds. A reasoning probe has access to a sampled subset of internal computation nodes, possibly without covering the entire graph, and seeks to infer which $\nu$-ary Boolean gate is executed at each queried node, representing uncertainty via a probability distribution over a fixed collection of $\mathtt{m}$ admissible $\nu$-ary gates. This partial observability induces a generalization problem, which we analyze in a realizable, transductive setting. We show that, when the reasoning probe is parameterized by a graph convolutional network (GCN)-based hypothesis class and queries $N$ nodes, the worst-case generalization error attains the optimal rate $\mathcal{O}(\sqrt{\log(2/\delta)}/\sqrt{N})$ with probability at least $1-\delta$, for $\delta\in (0,1)$. Our analysis combines snowflake metric embedding techniques with tools from statistical optimal transport. A key insight is that this optimal rate is achievable independently of graph size, owing to the existence of a low-distortion one-dimensional snowflake embedding of the induced graph metric. As a consequence, our results provide a sharp characterization of how structural properties of the computational graph govern the statistical efficiency of reasoning under partial access.
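The abstract's stylized model can be made concrete with a minimal sketch (not the paper's implementation; all names, the feedback convention, and the choice of $\mathtt{m}=3$ gates are illustrative assumptions): a perfect $\nu$-ary tree of Boolean gates is evaluated bottom-up, its root output is appended to the input and fed back for further rounds, and a probe queries a subset of $N$ internal nodes, representing its uncertainty about each queried gate as a distribution over the admissible gates.

```python
import random

NU, DEPTH, ROUNDS = 2, 3, 2            # arity, tree depth, looped rounds

# Fixed collection of m admissible nu-ary gates (m = 3 in this sketch).
GATES = {
    "AND": lambda bits: int(all(bits)),
    "OR":  lambda bits: int(any(bits)),
    "XOR": lambda bits: sum(bits) % 2,
}

N_LEAVES = NU ** DEPTH
N_INTERNAL = (NU ** DEPTH - 1) // (NU - 1)  # heap-indexed internal nodes

def sample_circuit(rng):
    """Assign an admissible gate to each internal node: the ground truth
    that a realizable probe tries to recover."""
    return [rng.choice(sorted(GATES)) for _ in range(N_INTERNAL)]

def run_looped(circuit, bits, rounds):
    """Evaluate the tree bottom-up, append the root output to the input,
    and repeat. One simple feedback convention (an assumption): each
    round reads the last N_LEAVES bits of the growing input."""
    for _ in range(rounds):
        values = [None] * N_INTERNAL + bits[-N_LEAVES:]
        for i in reversed(range(N_INTERNAL)):
            children = values[NU * i + 1 : NU * i + 1 + NU]
            values[i] = GATES[circuit[i]](children)
        bits = bits + [values[0]]       # output fed back ("looped")
    return bits

def probe_beliefs(rng, n_queries):
    """Query N internal nodes; the probe's belief about each queried
    node's gate is a distribution over the m admissible gates (here a
    uniform placeholder where the paper uses a GCN hypothesis class)."""
    nodes = rng.sample(range(N_INTERNAL), n_queries)
    return {v: {g: 1.0 / len(GATES) for g in GATES} for v in nodes}

rng = random.Random(0)
circuit = sample_circuit(rng)
trace = run_looped(circuit, [rng.randint(0, 1) for _ in range(N_LEAVES)], ROUNDS)
beliefs = probe_beliefs(rng, n_queries=4)
```

Under partial observability the probe sees only the `n_queries` sampled nodes, and the paper's result says the worst-case error of the best GCN-based probe decays at rate $\mathcal{O}(\sqrt{\log(2/\delta)}/\sqrt{N})$ in the number of queried nodes, independently of the tree's size.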
Problem

Research questions and friction points this paper is trying to address.

reasoning probes
looped Boolean circuits
partial observability
generalization error
Boolean gates
Innovation

Methods, ideas, or system contributions that make the work stand out.

reasoning probes
looped Boolean circuits
graph convolutional networks
snowflake metric embedding
statistical generalization
Anastasis Kratsios
McMaster University and Vector Institute
Mathematics of AI · Geometric Deep Learning · Approximation Theory · Learning Theory · Finance
Giulia Livieri
The London School of Economics and Political Science
A. M. Neuman
University of Vienna, Faculty of Mathematics