Decomposable Neuro-Symbolic Regression

📅 2025-11-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing symbolic regression methods prioritize minimizing prediction error over recovering the true governing equations of dynamical systems, often yielding redundant or structurally distorted expressions. To address this, we propose a neural-symbolic fusion framework for decomposable symbolic regression. First, multiple Multi-Set Transformers generate univariate symbolic skeletons, ensuring structural interpretability and modularity. A synergistic optimization combining genetic algorithms and genetic programming then performs skeleton selection and formula synthesis. Our method achieves superior or competitive noise robustness, interpolation accuracy, and extrapolation accuracy compared with state-of-the-art approaches. Crucially, it is the first to consistently and stably recover the original mathematical structure of governing equations, enabling a faithful representation of the underlying physical mechanisms and significantly enhancing both physical interpretability and modeling reliability.

📝 Abstract
Symbolic regression (SR) models complex systems by discovering mathematical expressions that capture underlying relationships in observed data. However, most SR methods prioritize minimizing prediction error over identifying the governing equations, often producing overly complex or inaccurate expressions. To address this, we present a decomposable SR method that generates interpretable multivariate expressions by leveraging transformer models, genetic algorithms (GAs), and genetic programming (GP). In particular, our explainable SR method distills a trained "opaque" regression model into mathematical expressions that serve as explanations of its computed function. Our method employs a Multi-Set Transformer to generate multiple univariate symbolic skeletons that characterize how each variable influences the opaque model's response. We then evaluate the generated skeletons' performance using a GA-based approach to select a subset of high-quality candidates before incrementally merging them via a GP-based cascade procedure that preserves their original skeleton structure. The final multivariate skeletons undergo coefficient optimization via a GA. We evaluated our method on problems with controlled and varying degrees of noise, demonstrating lower or comparable interpolation and extrapolation errors compared with two GP-based methods, three neural SR methods, and a hybrid approach. Unlike these baselines, our approach consistently learned expressions that matched the original mathematical structure.
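The abstract's final stage, GA-based coefficient optimization of a fixed skeleton, can be illustrated with a minimal sketch. Everything below is a hypothetical toy, not the paper's implementation: the skeleton `c0 * sin(c1 * x) + c2`, the target data, and the simple (mu + lambda) evolutionary loop are all assumptions chosen for clarity.

```python
# Hypothetical sketch: GA-style coefficient optimization of a fixed
# symbolic skeleton (one stage of the pipeline described above).
# The skeleton, data, and GA settings are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Toy target generated from y = 2.0 * sin(1.5 * x) + 0.5 (assumed example)
x = np.linspace(-3, 3, 200)
y = 2.0 * np.sin(1.5 * x) + 0.5

def skeleton(c, x):
    """Fixed skeleton c0 * sin(c1 * x) + c2 with free coefficients c."""
    return c[0] * np.sin(c[1] * x) + c[2]

def fitness(c):
    """Negative mean squared error: higher is better."""
    return -np.mean((skeleton(c, x) - y) ** 2)

# Simple elitist (mu + lambda) evolutionary loop over coefficient vectors
pop = rng.normal(0.0, 2.0, size=(50, 3))
for _ in range(200):
    scores = np.array([fitness(c) for c in pop])
    parents = pop[np.argsort(scores)[-10:]]                 # keep the 10 best
    children = parents[rng.integers(0, 10, 40)] \
        + rng.normal(0.0, 0.1, size=(40, 3))                # mutate copies
    pop = np.vstack([parents, children])

best = pop[np.argmax([fitness(c) for c in pop])]
print(np.round(best, 2))
```

Note the symmetry `c0*sin(c1*x) == (-c0)*sin(-c1*x)`: the GA may return either sign pattern, which is one reason structure-preserving methods compare skeletons rather than raw coefficient vectors.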
Problem

Research questions and friction points this paper is trying to address.

Generating interpretable mathematical expressions from data
Overcoming overly complex symbolic regression solutions
Distilling opaque regression models into explainable functions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer generates univariate symbolic skeletons
Genetic algorithm selects high-quality candidate expressions
Genetic programming merges skeletons preserving structure
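The merge-while-preserving-structure idea above can be sketched as a tiny search over binary connectives. This is an assumed illustration, not the paper's GP cascade: the ground-truth function, the two pre-fitted univariate skeletons `f1` and `f2`, and the candidate set are all hypothetical.

```python
# Hypothetical sketch of the cascade-merge idea: combine two fitted
# univariate skeletons into one multivariate expression by searching
# over a few binary connectives, leaving each skeleton's internal
# structure intact. All names and the target function are assumptions.
import numpy as np

rng = np.random.default_rng(1)
x1 = rng.uniform(-2, 2, 300)
x2 = rng.uniform(-2, 2, 300)
y = np.sin(x1) * (x2 ** 2 + 1.0)      # assumed ground truth

# Univariate skeletons (coefficients already optimized upstream)
f1 = lambda x: np.sin(x)              # skeleton found for x1
f2 = lambda x: x ** 2 + 1.0           # skeleton found for x2

# Try each connective and keep the one with the lowest MSE
candidates = {
    "f1 + f2": f1(x1) + f2(x2),
    "f1 - f2": f1(x1) - f2(x2),
    "f1 * f2": f1(x1) * f2(x2),
}
best_name = min(candidates, key=lambda k: np.mean((candidates[k] - y) ** 2))
print(best_name)   # the multiplicative merge matches the target exactly
```

Because the search only chooses how to connect the skeletons, never rewrites them, the recovered expression keeps the structure each Multi-Set Transformer proposed.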