Symbolic Density Estimation for Discrete Distributions

📅 2026-05-20
📈 Citations: 0
Influential: 0

🤖 AI Summary
Discrete probability distributions have traditionally been derived by hand, which hinders the automatic discovery of interpretable models. This work proposes Symbolic Density Estimation (SDE), a framework that combines structural priors, evolutionary search, and validity-aware parameter inference to discover closed-form probability mass functions within a structured symbolic space built from elementary mathematical operations. SDE accommodates richer distributional features such as zero inflation and finite mixtures. The authors contribute a benchmark dataset for this task and report that SDE recovers all benchmark distribution families with accurate parameter estimates. On real-world data, SDE identifies concise, interpretable mixture models that improve goodness-of-fit over standard models.
📝 Abstract
Discrete probability laws underpin statistical modeling, yet the catalog of interpretable distributions has expanded only gradually through centuries of case-by-case mathematical derivations. We introduce symbolic density estimation (SDE), an unsupervised framework that automatically recovers closed-form probability mass functions by composing elementary analytic operations within a structured search space. Our method integrates domain-specific structural priors with evolutionary search and a validity-aware inference stage, and it extends to richer distribution families such as zero-inflated models and finite mixtures. To support systematic evaluation and future research, we contribute a benchmark dataset spanning a broad collection of commonly used discrete distributions. The proposed algorithm recovers all benchmark families with accurate parameter estimates. A real-data application shows that it identifies concise and interpretable mixture models that improve goodness-of-fit over standard models.
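To make the idea of a validity-aware step concrete, the following is a minimal sketch and not the paper's implementation. It assumes a candidate expression is given as a plain Python function of a count `k` and its parameters, checks that it is finite and nonnegative on a truncated support, normalizes it into a probability mass function, and then wraps it with zero inflation. The names `normalize`, `zero_inflate`, `log_likelihood`, `poisson_kernel`, and the truncation bound `support_max` are all hypothetical choices made for this illustration.

```python
import math

def normalize(unnormalized, params, support_max=100):
    """Turn a candidate expression into a PMF on {0, ..., support_max}.

    Returns None when the candidate is not a valid (nonnegative, finite,
    summable) weight function on the truncated support.
    """
    weights = [unnormalized(k, *params) for k in range(support_max + 1)]
    if any((not math.isfinite(w)) or w < 0 for w in weights):
        return None
    total = sum(weights)
    if total <= 0 or not math.isfinite(total):
        return None
    return [w / total for w in weights]

def zero_inflate(pmf, pi):
    """Mix a point mass at zero (weight pi) with a base PMF (weight 1 - pi)."""
    inflated = [(1.0 - pi) * p for p in pmf]
    inflated[0] += pi
    return inflated

def log_likelihood(pmf, data):
    """Log-likelihood of observed counts under a tabulated PMF."""
    return sum(math.log(pmf[k]) for k in data)

def poisson_kernel(k, lam):
    """Example candidate: lam**k / k!, computed in log space for stability."""
    return math.exp(k * math.log(lam) - math.lgamma(k + 1))

def invalid_candidate(k, a):
    """Example of an expression that goes negative and must be rejected."""
    return a - k

pmf = normalize(poisson_kernel, (2.0,))
zip_pmf = zero_inflate(pmf, 0.3)
rejected = normalize(invalid_candidate, (5.0,))
ll = log_likelihood(zip_pmf, [0, 0, 1, 3])
```

In a search procedure of the kind the abstract describes, a check like `normalize` would let invalid candidates be discarded before any parameter fitting, and `log_likelihood` is one plausible fitness score; how SDE actually scores and constrains candidates is not specified in this listing.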
Problem

Research questions and friction points this paper is trying to address.

discrete distributions
symbolic density estimation
interpretable models
probability mass functions
unsupervised learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

symbolic density estimation
discrete distributions
evolutionary search
interpretable models
closed-form PMF