Sparse Interpretable Deep Learning with LIES Networks for Symbolic Regression

📅 2025-06-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing symbolic regression (SR) methods suffer from poor scalability, inconsistent symbolic representations, and redundant expressions. To address these issues, this paper proposes LIES: an end-to-end differentiable neural network with a fixed architecture built from four interpretable primitive activations—Log, Identity, Exp, and Sin—to enforce symbolic consistency. LIES combines an oversampling-based training strategy with a loss function tailored to promote sparsity and prevent gradient instability, and applies heuristic expression extraction with post-training pruning to jointly optimize expression simplicity and fidelity. On standard SR benchmarks, LIES consistently outperforms state-of-the-art baselines, producing significantly shorter expressions with lower approximation error. Ablation studies confirm the contributions of the architecture, the sparse loss formulation, and the pruning module to interpretability, generalization, and overall SR performance.

📝 Abstract
Symbolic regression (SR) aims to discover closed-form mathematical expressions that accurately describe data, offering interpretability and analytical insight beyond standard black-box models. Existing SR methods often rely on population-based search or autoregressive modeling, which struggle with scalability and symbolic consistency. We introduce LIES (Logarithm, Identity, Exponential, Sine), a fixed neural network architecture with interpretable primitive activations that are optimized to model symbolic expressions. We develop a framework to extract compact formulae from LIES networks by training with an appropriate oversampling strategy and a tailored loss function to promote sparsity and prevent gradient instability. After training, the framework applies additional pruning strategies to further simplify the learned expressions into compact formulae. Our experiments on SR benchmarks show that the LIES framework consistently produces sparse and accurate symbolic formulae, outperforming all baselines. We also demonstrate the importance of each design component through ablation studies.
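To make the architecture concrete, below is a minimal, hypothetical sketch of a single LIES-style layer: each unit takes a weighted sum of its inputs and applies one of the four primitive activations (Log, Identity, Exp, Sin). The guards on log and exp are illustrative stability choices, not necessarily the paper's exact formulation, and all names here are assumptions rather than the authors' code.

```python
import math

# Illustrative guarded primitives (the paper's exact safeguards may differ).
PRIMITIVES = {
    "log": lambda z: math.log(abs(z) + 1e-6),  # guarded log: avoids log(0) and negatives
    "id": lambda z: z,                         # identity passes the weighted sum through
    "exp": lambda z: math.exp(min(z, 20.0)),   # clipped exp: avoids overflow
    "sin": math.sin,
}

def lies_layer(x, weights, activations):
    """Apply one fixed-architecture LIES-style layer.

    x           : list of input values
    weights     : one weight row per output unit
    activations : primitive name ("log", "id", "exp", "sin") per unit
    """
    out = []
    for row, act in zip(weights, activations):
        z = sum(w * xi for w, xi in zip(row, x))  # weighted sum of inputs
        out.append(PRIMITIVES[act](z))            # interpretable activation
    return out

# Example: two inputs feeding three units with different primitives
y = lies_layer([1.0, 2.0],
               [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
               ["id", "sin", "exp"])
```

Because every unit is one of four named primitives, a trained network with mostly-zero weights reads off directly as a short closed-form expression, which is what the extraction step exploits.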
Problem

Research questions and friction points this paper is trying to address.

Develop interpretable neural networks for symbolic regression
Overcome scalability and consistency issues in existing methods
Extract compact formulae via sparsity-promoting training and pruning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fixed neural network with interpretable activations
Oversampling and tailored loss for sparsity
Pruning strategies to simplify expressions
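The pruning idea in the last bullet can be sketched as standard magnitude pruning: weights below a threshold are zeroed, so the corresponding terms drop out of the extracted formula. This is a generic sketch under that assumption; the paper's actual pruning heuristics may be more elaborate.

```python
def prune_weights(weights, threshold=1e-2):
    """Zero out weights with magnitude below `threshold`.

    A simple magnitude-pruning sketch (illustrative, not the paper's exact
    procedure): zeroed weights remove terms from the extracted expression,
    yielding a shorter formula.
    """
    return [[w if abs(w) >= threshold else 0.0 for w in row]
            for row in weights]

# Small entries become exactly 0.0, shortening the symbolic expression
pruned = prune_weights([[0.8, 0.003], [1e-4, 0.5]])
```

After pruning, the network would typically be re-evaluated (or briefly fine-tuned) to confirm that fidelity is preserved while the expression shrinks.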
Mansooreh Montazerin
Department of Electrical & Computer Engineering, University of Southern California
Majd Al Aawar
Department of Electrical & Computer Engineering, University of Southern California
Antonio Ortega
Dean's Professor of Electrical and Computer Engineering, University of Southern California
Signal Processing, Graph Signal Processing
Ajitesh Srivastava
Research Assistant Professor, University of Southern California
Graph Analytics, Machine Learning, Epidemics, Social Good, Architecture