Symbolic Density Estimation: A Decompositional Approach

📅 2026-03-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes AI-Kolmogorov, a framework that applies symbolic regression to probability density estimation, a task the authors call Symbolic Density Estimation (SymDE). The method first decomposes a complex distribution through clustering and/or probabilistic graphical model structure learning, then applies nonparametric density estimation, support estimation, and symbolic regression in sequence to obtain an interpretable analytic expression for the density. Evaluated on synthetic mixture models, multivariate normal distributions, and three exotic distributions (two of them motivated by high-energy physics), the framework recovers the underlying distribution or otherwise yields insight into its mathematical structure.
📝 Abstract
We introduce AI-Kolmogorov, a novel framework for Symbolic Density Estimation (SymDE). Symbolic regression (SR) has been effectively used to produce interpretable models in standard regression settings but its applicability to density estimation tasks has largely been unexplored. To address the SymDE task we introduce a multi-stage pipeline: (i) problem decomposition through clustering and/or probabilistic graphical model structure learning; (ii) nonparametric density estimation; (iii) support estimation; and finally (iv) SR on the density estimate. We demonstrate the efficacy of AI-Kolmogorov on synthetic mixture models, multivariate normal distributions, and three exotic distributions, two of which are motivated by applications in high-energy physics. We show that AI-Kolmogorov can discover underlying distributions or otherwise provide valuable insight into the mathematical expressions describing them.
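The four stages listed in the abstract can be illustrated with a small one-dimensional sketch. This is not the authors' implementation: every function name below is hypothetical, the clustering step is a plain two-means split, the support is simply the observed sample range, and the symbolic regression stage is replaced by a search over a two-entry candidate library (Gaussian and Laplace) scored against the kernel density estimate. A real SR engine would search a much larger expression space.

```python
import math
import random
import statistics


def two_means_1d(xs, iters=25):
    """Stage (i) stand-in: split 1-D samples into two clusters (Lloyd's algorithm)."""
    c0, c1 = min(xs), max(xs)  # extreme points as initial centers
    for _ in range(iters):
        left = [x for x in xs if abs(x - c0) <= abs(x - c1)]
        right = [x for x in xs if abs(x - c0) > abs(x - c1)]
        c0, c1 = statistics.fmean(left), statistics.fmean(right)
    return left, right


def gaussian_kde(xs):
    """Stage (ii): Gaussian KDE with the normal-reference bandwidth 1.06*s*n^(-1/5)."""
    n = len(xs)
    h = 1.06 * statistics.stdev(xs) * n ** -0.2
    norm = 1.0 / (n * h * math.sqrt(2 * math.pi))
    return lambda t: norm * sum(math.exp(-0.5 * ((t - x) / h) ** 2) for x in xs)


def sample_range(xs):
    """Stage (iii) stand-in: use the observed sample range as the support."""
    return min(xs), max(xs)


def candidate_library(xs):
    """Stage (iv) stand-in: closed-form candidates with parameters set from the data."""
    mu, sigma = statistics.fmean(xs), statistics.stdev(xs)
    med = statistics.median(xs)
    b = statistics.fmean([abs(x - med) for x in xs])  # Laplace scale estimate
    return {
        "gaussian": lambda t: math.exp(-0.5 * ((t - mu) / sigma) ** 2)
        / (sigma * math.sqrt(2 * math.pi)),
        "laplace": lambda t: math.exp(-abs(t - med) / b) / (2 * b),
    }


def fit_component(xs, grid_size=50):
    """Pick the candidate with the smallest squared error to the KDE on a grid."""
    lo, hi = sample_range(xs)
    grid = [lo + (hi - lo) * i / (grid_size - 1) for i in range(grid_size)]
    kde = gaussian_kde(xs)
    target = [kde(t) for t in grid]
    errors = {
        name: sum((f(t) - y) ** 2 for t, y in zip(grid, target))
        for name, f in candidate_library(xs).items()
    }
    return min(errors, key=errors.get), (lo, hi)


def symbolic_density_sketch(xs):
    """Run the four stages; return (weight, support, expression name) per component."""
    components = []
    for cluster in two_means_1d(xs):
        name, interval = fit_component(cluster)
        components.append((len(cluster) / len(xs), interval, name))
    return components


# Synthetic two-component mixture, well separated so the split is unambiguous.
random.seed(0)
samples = [random.gauss(-4.0, 1.0) for _ in range(300)]
samples += [random.gauss(4.0, 1.0) for _ in range(300)]
result = symbolic_density_sketch(samples)
```

Each entry of `result` pairs a mixture weight and a support interval with the name of the selected closed-form expression, so the full density reads as a weighted sum of per-component formulas. Decomposing first keeps each fitting problem small, which is the point of the divide-and-conquer design the abstract describes.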
Problem

Research questions and friction points this paper is trying to address.

Symbolic Density Estimation
Symbolic Regression
Density Estimation
Interpretable Models
Probabilistic Modeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

Symbolic Density Estimation
Symbolic Regression
Problem Decomposition
Nonparametric Density Estimation
Interpretable Modeling
Angelo Rajendram
Department of Mathematics, University of Waterloo, Waterloo, Canada
Xieting Chu
Department of Computer Science, Georgia Tech, Atlanta, USA
Vijay Ganesh
Professor, Georgia Institute of Technology, Atlanta, GA, USA
SAT/SMT Solvers, AI, software engineering, mathematical logic, quantum foundations
Max Fieg
University of California, Irvine, Irvine, USA
Aishik Ghosh
Department of Computer Science, Georgia Tech, Atlanta, USA