Diffusion-Based Symbolic Regression

📅 2025-05-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the challenge of balancing formula diversity and accuracy in symbolic regression by proposing a diffusion-based equation generation framework. Methodologically, it introduces a random mask-based diffusion and denoising process for generating symbolic expressions, and combines it with token-wise Group Relative Policy Optimization (GRPO) and a long short-term risk-seeking reinforcement learning policy to search for the best-fitting equations. Key contributions include: (1) a novel application of diffusion models to symbolic regression, in contrast to conventional tree- or sequence-based generation paradigms; (2) the integration of token-wise GRPO with risk-seeking exploration, which expands the pool of top-performing candidates; and (3) extensive experiments and ablation studies that the authors report as demonstrating the effectiveness of the approach.

📝 Abstract
Diffusion has emerged as a powerful framework for generative modeling, achieving remarkable success in applications such as image and audio synthesis. Inspired by this progress, we propose a novel diffusion-based approach for symbolic regression. We construct a random mask-based diffusion and denoising process to generate diverse and high-quality equations. We integrate this generative process with a token-wise Group Relative Policy Optimization (GRPO) method to conduct efficient reinforcement learning on the given measurement dataset. In addition, we introduce a long short-term risk-seeking policy to expand the pool of top-performing candidates, further enhancing performance. Extensive experiments and ablation studies demonstrate the effectiveness of our approach.
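To make the "random mask-based diffusion" idea concrete, here is a minimal sketch of a forward (corruption) process that replaces a growing fraction of equation tokens with a mask token. It is an illustration only, not the authors' implementation: the `MASK` token, the linear masking schedule, the prefix-notation tokenization, and the function names `mask_tokens` and `forward_process` are all assumptions. A learned denoiser (not shown) would be trained to recover the masked tokens.

```python
import random

MASK = "<mask>"  # hypothetical mask token, not taken from the paper


def mask_tokens(tokens, mask_ratio, rng):
    """Replace a random subset of tokens with MASK.

    The number of masked positions is round(mask_ratio * len(tokens)).
    """
    n_mask = round(mask_ratio * len(tokens))
    masked_positions = set(rng.sample(range(len(tokens)), n_mask))
    return [MASK if i in masked_positions else tok
            for i, tok in enumerate(tokens)]


def forward_process(tokens, num_steps, seed=0):
    """Return corrupted copies of `tokens` for steps 0..num_steps.

    Assumed linear schedule: step t masks a fraction t / num_steps.
    Each step is sampled independently from the clean sequence, so the
    masked sets are not guaranteed to be nested across steps.
    """
    rng = random.Random(seed)
    return [mask_tokens(tokens, t / num_steps, rng)
            for t in range(num_steps + 1)]


# Prefix-notation tokens for x*x + sin(x) (assumed tokenization).
expr = ["add", "mul", "x", "x", "sin", "x"]
trajectory = forward_process(expr, num_steps=3)
for step, corrupted in enumerate(trajectory):
    print(step, corrupted)
```

Step 0 leaves the expression untouched and the final step masks every token; generation would run the reverse direction, starting from a fully masked sequence and filling tokens in.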
Problem

Research questions and friction points this paper is trying to address.

Proposes diffusion-based symbolic regression for equation generation
Integrates generative process with GRPO for efficient reinforcement learning
Introduces risk-seeking policy to improve top candidate diversity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Random mask-based diffusion for equation generation
Token-wise GRPO for efficient reinforcement learning
Risk-seeking policy to expand top candidates
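The two reinforcement learning ingredients listed above can be sketched in a few lines. This is a generic illustration under stated assumptions, not the paper's algorithm: `group_relative_advantages` shows the standard group-normalized advantage used in GRPO-style methods, and `risk_seeking_filter` shows a simple top-fraction selection of candidates, which is one common way to realize risk-seeking training. The function names, the `top_fraction` parameter, and the example rewards are hypothetical, and the paper's long short-term variant is not reproduced here.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one sampled group: (r - mean) / (std + eps).

    In a token-wise setup, each sequence's advantage would typically be
    applied to every token of that sequence (assumption).
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


def risk_seeking_filter(candidates, rewards, top_fraction=0.25):
    """Keep only the best-scoring fraction of candidates (at least one)."""
    k = max(1, int(len(rewards) * top_fraction))
    ranked = sorted(zip(rewards, candidates),
                    key=lambda pair: pair[0], reverse=True)
    return [(cand, rew) for rew, cand in ranked[:k]]


# Hypothetical candidate equations with made-up fitness rewards.
candidates = ["x+1", "x*x", "sin(x)", "x*x+x"]
rewards = [0.2, 0.9, 0.5, 0.4]

advantages = group_relative_advantages(rewards)
elite = risk_seeking_filter(candidates, rewards, top_fraction=0.25)
print(advantages)
print(elite)
```

Group normalization removes the need for a learned value baseline, while the filter focuses updates on the best candidates rather than the average one, which matches the goal in symbolic regression of finding a single best-fitting equation.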
Zachary Bastiani
Kahlert School of Computing, Scientific Computing and Imaging Institute, University of Utah
Robert M. Kirby
Kahlert School of Computing, Scientific Computing and Imaging Institute, University of Utah
Jacob Hochhalter
University of Utah
Mechanical behavior of materials, Interpretable machine learning, High performance computing, Digital image correlation
Shandian Zhe
School of Computing, University of Utah
Probabilistic Machine Learning