Mind the Gap: Removing the Discretization Gap in Differentiable Logic Gate Networks

📅 2025-06-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
Logic gate networks (LGNs) suffer from a critical training-inference performance gap—termed the “discretization gap”—alongside low logic gate utilization and slow convergence. To address these issues, we propose a differentiable training framework integrating Gumbel noise and the straight-through estimator (STE). We theoretically establish, for the first time, that Gumbel perturbation implicitly induces Hessian regularization, explaining its dual role in accelerating convergence and narrowing the discretization gap. By modeling logic gates as differentiable operators and employing Gumbel-Softmax sampling, our method ensures 100% gate activation, eliminating unused gates entirely. Experiments demonstrate a 4.5× speedup in training, a 98% reduction in the discretization gap, and significant improvements in both accuracy and inference efficiency on benchmark tasks such as CIFAR-10.
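The summary's "modeling logic gates as differentiable operators" can be illustrated with the standard probabilistic relaxation of two-input gates, where each neuron is a softmax-weighted mixture over relaxed binary ops. The gate list and function names below are our own illustrative choices, not code from the paper:

```python
import numpy as np

# Probabilistic relaxations of a few of the 16 two-input logic gates,
# valid for inputs a, b in [0, 1] (hard logic recovered at {0, 1}).
GATES = [
    lambda a, b: a * b,              # AND
    lambda a, b: a + b - a * b,      # OR
    lambda a, b: a + b - 2 * a * b,  # XOR
    lambda a, b: 1 - a * b,          # NAND
]

def soft_gate(a, b, logits):
    """Differentiable gate: softmax over learnable logits selects
    a convex mixture of the relaxed ops."""
    w = np.exp(logits - logits.max())
    w = w / w.sum()
    return sum(wi * g(a, b) for wi, g in zip(w, GATES))
```

With logits strongly favoring one op, the mixture approaches that gate's hard truth table, which is why discretizing a well-trained network can work at all; the "discretization gap" is the accuracy lost when the logits are not yet peaked.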

📝 Abstract
Modern neural networks demonstrate state-of-the-art performance on numerous existing benchmarks; however, their high computational requirements and energy consumption prompt researchers to seek more efficient solutions for real-world deployment. Logic gate networks (LGNs) learn a large network of logic gates for efficient image classification. However, learning a network that can solve a simple problem like CIFAR-10 can take days to weeks to train. Even then, almost half of the network remains unused, causing a discretization gap. This discretization gap hinders real-world deployment of LGNs, as the performance drop between training and inference negatively impacts accuracy. We inject Gumbel noise with a straight-through estimator during training to significantly speed up training, improve neuron utilization, and decrease the discretization gap. We theoretically show that this results from implicit Hessian regularization, which improves the convergence properties of LGNs. We train networks $4.5\times$ faster in wall-clock time, reduce the discretization gap by $98\%$, and reduce the number of unused gates by $100\%$.
Problem

Research questions and friction points this paper is trying to address.

Reduce discretization gap in logic gate networks
Speed up training of logic gate networks
Improve neuron utilization in logic gate networks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Gumbel noise for faster training
Applies straight-through estimator technique
Reduces discretization gap significantly