Einsum Networks: Fast and Scalable Learning of Tractable Probabilistic Circuits

📅 2020-04-13

🏛️ International Conference on Machine Learning

📈 Citations: 139

✨ Influential: 11

🤖 AI Summary

Probabilistic circuits (PCs) have been slow to train, hard to scale, and memory-hungry because their computational graphs are sparsely connected, which has made them difficult to apply to real-world image data. This paper proposes Einsum Networks (EiNets), an implementation design that combines a large number of arithmetic operations into a single monolithic einsum operation. It also shows that the implementation of Expectation-Maximization (EM) for PCs can be simplified by leveraging automatic differentiation. Compared with previous implementations, EiNets achieve speedups and memory savings of up to two orders of magnitude. They scale to datasets that were previously out of reach, such as SVHN and CelebA, and can be used as faithful generative image models, while keeping exact support for probabilistic queries such as marginalization and conditional inference.

📝 Abstract

Probabilistic circuits (PCs) are a promising avenue for probabilistic modeling, as they permit a wide range of exact and efficient inference routines. Recent "deep-learning-style" implementations of PCs strive for better scalability, but are still difficult to train on real-world data, due to their sparsely connected computational graphs. In this paper, we propose Einsum Networks (EiNets), a novel implementation design for PCs, improving prior art in several regards. At their core, EiNets combine a large number of arithmetic operations in a single monolithic einsum operation, leading to speedups and memory savings of up to two orders of magnitude in comparison to previous implementations. As an algorithmic contribution, we show that the implementation of Expectation-Maximization (EM) can be simplified for PCs by leveraging automatic differentiation. Furthermore, we demonstrate that EiNets scale well to datasets which were previously out of reach, such as SVHN and CelebA, and that they can be used as faithful generative image models.
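To make the "single monolithic einsum operation" concrete, here is a minimal NumPy sketch of one layer of sum nodes computed over products of two child vectors. It is an illustration under stated assumptions, not the authors' code: the function names, the tensor shapes, and the index string `"kij,bi,bj->bk"` are chosen for this example, and the log-domain variant uses a generic max-shift stabilization.

```python
import numpy as np

def einsum_layer(W, N, Np):
    """K sum nodes over all pairwise products of two child vectors, for a batch.

    W  : (K, I, J) non-negative weights; each W[k] sums to 1 (assumed layout)
    N  : (B, I) densities of the left child
    Np : (B, J) densities of the right child
    Returns (B, K) with S[b, k] = sum_{i,j} W[k, i, j] * N[b, i] * Np[b, j].
    """
    return np.einsum("kij,bi,bj->bk", W, N, Np)

def einsum_layer_log(W, logN, logNp):
    """Same layer on log-densities; subtracts the per-row max before exp."""
    a = logN.max(axis=1, keepdims=True)    # (B, 1)
    ap = logNp.max(axis=1, keepdims=True)  # (B, 1)
    S = np.einsum("kij,bi,bj->bk", W, np.exp(logN - a), np.exp(logNp - ap))
    return np.log(S) + a + ap              # shift back into the log domain

# Hypothetical toy sizes and random inputs, for illustration only.
rng = np.random.default_rng(0)
B, K, I, J = 4, 3, 5, 5
W = rng.random((K, I, J))
W /= W.sum(axis=(1, 2), keepdims=True)     # normalize each sum node's weights
N = rng.random((B, I)) + 0.1
Np = rng.random((B, J)) + 0.1

# Reference: the same arithmetic written as explicit node-by-node loops.
ref = np.zeros((B, K))
for b in range(B):
    for k in range(K):
        for i in range(I):
            for j in range(J):
                ref[b, k] += W[k, i, j] * N[b, i] * Np[b, j]

out = einsum_layer(W, N, Np)
out_log = einsum_layer_log(W, np.log(N), np.log(Np))
```

The nested loops and the einsum call compute the same values; the einsum form hands the whole layer to one vectorized operation, which is the kind of consolidation the abstract credits for the speedups and memory savings.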

Problem

Research questions and friction points this paper is trying to address.

Improving training efficiency of probabilistic circuits for real-world data

Enabling scalable probabilistic modeling with a novel einsum-based implementation design

Extending tractable inference to complex datasets like SVHN and CelebA

Innovation

Methods, ideas, or system contributions that make the work stand out.

Einsum Networks combine a large number of arithmetic operations in a single monolithic einsum operation

The implementation of Expectation-Maximization is simplified by leveraging automatic differentiation

Einsum Networks scale to large datasets like SVHN and CelebA
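The link between EM and differentiation can be illustrated on the simplest possible case, a flat mixture with fixed components. For such a mixture, the expected counts used by the EM weight update equal the current weights multiplied by the gradient of the log-likelihood with respect to those weights. The sketch below is an assumption-laden toy, not the paper's implementation: the gradient is written by hand where an autodiff framework would supply it from a backward pass, and all function names and data are hypothetical.

```python
import numpy as np

def em_step_from_gradient(w, comp):
    """One EM update of mixture weights, driven by the log-likelihood gradient.

    w    : (K,) current mixture weights
    comp : (B, K) component densities p_k(x_b), held fixed here
    """
    px = comp @ w                            # (B,) mixture density per sample
    # d/dw_k of sum_b log p(x_b); hand-written stand-in for autodiff.
    grad = (comp / px[:, None]).sum(axis=0)
    stats = w * grad                         # expected counts per component
    return stats / stats.sum()               # renormalize to a distribution

def em_step_classic(w, comp):
    """Textbook EM update: average the posterior responsibilities."""
    resp = comp * w
    resp /= resp.sum(axis=1, keepdims=True)
    return resp.mean(axis=0)

# Hypothetical toy data: two unit-variance Gaussians at -2 and +2.
rng = np.random.default_rng(0)
z = rng.random(500) < 0.3
x = np.where(z, rng.normal(-2.0, 1.0, 500), rng.normal(2.0, 1.0, 500))
means = np.array([-2.0, 2.0])
comp = np.exp(-0.5 * (x[:, None] - means) ** 2) / np.sqrt(2 * np.pi)

w0 = np.array([0.5, 0.5])
w = w0.copy()
lls = []
for _ in range(20):
    lls.append(np.log(comp @ w).sum())       # log-likelihood before the update
    w = em_step_from_gradient(w, comp)
```

Both update functions return the same weights, so no separate bookkeeping of responsibilities is needed once the gradient is available. Extending this idea from a flat mixture to the sum nodes of a full circuit is the paper's contribution and is not reproduced here.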
