🤖 AI Summary
Squared tensor networks (TNs) and squared circuits offer strong expressiveness together with closed-form marginalization, but the explicit squaring operation makes computing the partition function and marginals substantially more expensive, limiting practical deployment.
Method: We propose parameterizations of squared circuits that remove the need for explicit squaring by combining two ideas: the orthogonality that underlies canonical forms of TNs (classically enforced via unitary matrices) and the determinism that makes maximization tractable in circuits. These conditions enable efficient marginal inference even in circuits representing factorizations that do not map to any known TN.
Contribution/Results: The proposed conditions eliminate the marginalization overhead of squaring while preserving modeling accuracy. Experiments on distribution estimation show that the constrained squared circuits suffer no loss of expressiveness while enabling more efficient learning and marginal inference.
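To make the squaring overhead concrete, consider the simplest TN case. The following is a minimal NumPy sketch (our illustration, not code from the paper), assuming a toy matrix-product-state (MPS) factorization f over binary variables: evaluating Z = sum_x f(x)^2 contracts two copies of the network, so the running message lives in an r-by-r space, where r is the bond dimension.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 6, 4
# Toy MPS: f(x) = A_1[x_1] @ A_2[x_2] @ ... @ A_n[x_n] over n binary
# variables, with boundary bond dimension 1 and inner bond dimension r.
cores = [rng.standard_normal((2, 1 if i == 0 else r, r if i < n - 1 else 1))
         for i in range(n)]

def partition_function_squared(cores):
    """Contract two copies of the network to get Z = sum_x f(x)^2.

    The running message m is (bond x bond): the squared network has an
    effective bond dimension of r^2 rather than r.
    """
    m = np.ones((1, 1))
    for A in cores:
        # m'_{c,d} = sum_{v,a,b} m_{a,b} * A[v,a,c] * A[v,b,d]
        m = np.einsum('ab,vac,vbd->cd', m, A, A)
    return m.item()

Z = partition_function_squared(cores)
print(f"Z = {Z:.6f}")
```

Each contraction step costs O(r^4) per variable, against O(r^2) for a single evaluation of f; this blow-up in the bond dimension is the overhead that motivates the parameterizations above.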
📝 Abstract
Squared tensor networks (TNs) and their extension as computational graphs, squared circuits, have been used as expressive distribution estimators that still support closed-form marginalization. However, the squaring operation introduces additional complexity when computing the partition function or marginalizing variables, which hinders their applicability in ML. To address this issue, canonical forms of TNs can be parameterized via unitary matrices, simplifying the computation of marginals. However, these canonical forms do not apply to circuits, as circuits can represent factorizations that do not directly map to a known TN. Inspired by the orthogonality of canonical forms and by the determinism that enables tractable maximization in circuits, we show how to parameterize squared circuits so as to overcome their marginalization overhead. Our parameterizations unlock efficient marginalization even for factorizations that differ from TNs but are encoded as circuits whose structure would otherwise make marginalization computationally hard. Finally, our experiments on distribution estimation show that the proposed conditions on squared circuits come with no expressiveness loss while enabling more efficient learning.
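For intuition on the canonical forms mentioned above, here is a companion NumPy sketch (again our illustration on the same toy MPS, not the paper's circuit construction): left-orthogonalizing the cores via QR makes every squared-network message collapse to the identity, so the partition function reads directly off the last core and marginals reduce to local contractions.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n, r = 6, 4
cores = [rng.standard_normal((2, 1 if i == 0 else r, r if i < n - 1 else 1))
         for i in range(n)]

def left_canonicalize(cores):
    """Left-orthogonalize MPS cores with a QR sweep.

    Afterwards every non-final core Q satisfies sum_v Q[v].T @ Q[v] = I,
    so the squared-network message stays the identity and
    Z = sum_x f(x)^2 is just the last core's squared Frobenius norm.
    """
    out, R = [], np.ones((1, 1))
    for A in cores[:-1]:
        A = np.einsum('ij,vjk->vik', R, A)        # absorb the previous R factor
        v, a, b = A.shape
        Q, R = np.linalg.qr(A.reshape(v * a, b))  # thin QR
        out.append(Q.reshape(v, a, -1))
    out.append(np.einsum('ij,vjk->vik', R, cores[-1]))
    return out

canon = left_canonicalize(cores)
Z_fast = float(np.sum(canon[-1] ** 2))  # no squared-network contraction

# Sanity check against brute-force enumeration of all 2^n assignments.
def f(cs, x):
    m = np.ones((1, 1))
    for A, v in zip(cs, x):
        m = m @ A[v]
    return m.item()

Z_brute = sum(f(cores, x) ** 2
              for x in itertools.product(range(2), repeat=n))
assert np.isclose(Z_fast, Z_brute)
```

The paper's orthogonality and determinism conditions play an analogous role for squared circuits, where no such linear-chain QR sweep is available.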