Sparse Interpretable Deep Learning with LIES Networks for Symbolic Regression

📅 2025-06-09

🤖 AI Summary

Existing symbolic regression (SR) methods suffer from poor scalability, inconsistent symbolic representations, and redundant expressions. To address these issues, this paper proposes LIES, an end-to-end differentiable, fixed-architecture neural network whose activations are four interpretable primitive functions (logarithm, identity, exponential, and sine), which keeps the learned symbolic form consistent. LIES is trained with an oversampling strategy and a tailored loss function that promotes sparsity and prevents gradient instability. After training, heuristic expression extraction and pruning simplify the network into a compact formula while preserving fidelity. On standard SR benchmarks, LIES consistently outperforms state-of-the-art baselines, producing shorter expressions with lower approximation error. Ablation studies confirm that the architecture, the sparsity-promoting loss, and the pruning module each contribute to interpretability, generalization, and overall SR performance.

📝 Abstract

Symbolic regression (SR) aims to discover closed-form mathematical expressions that accurately describe data, offering interpretability and analytical insight beyond standard black-box models. Existing SR methods often rely on population-based search or autoregressive modeling, which struggle with scalability and symbolic consistency. We introduce LIES (Logarithm, Identity, Exponential, Sine), a fixed neural network architecture with interpretable primitive activations that are optimized to model symbolic expressions. We develop a framework to extract compact formulae from LIES networks by training with an appropriate oversampling strategy and a tailored loss function that promotes sparsity and prevents gradient instability. After training, the framework applies additional pruning strategies to further simplify the learned expressions into compact formulae. Our experiments on SR benchmarks show that the LIES framework consistently produces sparse and accurate symbolic formulae, outperforming all baselines. We also demonstrate the importance of each design component through ablation studies.
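
To make the idea of "interpretable primitive activations" concrete, here is a minimal sketch of one layer whose units apply log, identity, exp, and sin in turn. It is an illustration, not the paper's implementation: the function name `lies_layer`, the cyclic assignment of primitives to units, the `EPS` guard inside the logarithm, and the `EXP_CLIP` bound on the exponential are all assumptions, since the abstract does not say how the network is wired or how numerical stability is enforced.

```python
import math

EPS = 1e-6        # assumed guard that keeps log away from zero
EXP_CLIP = 20.0   # assumed clip that keeps exp from overflowing

# The four primitives named in the abstract; the guards are assumptions.
PRIMITIVES = {
    "log": lambda z: math.log(abs(z) + EPS),
    "id": lambda z: z,
    "exp": lambda z: math.exp(max(-EXP_CLIP, min(EXP_CLIP, z))),
    "sin": math.sin,
}
ORDER = ("log", "id", "exp", "sin")


def lies_layer(x, weights, biases):
    """Apply one hypothetical LIES-style layer.

    x       : list of input floats
    weights : one weight row per unit, each of length len(x)
    biases  : one bias per unit
    Unit k uses primitive ORDER[k % 4], so units cycle log, id, exp, sin.
    """
    out = []
    for k, (row, b) in enumerate(zip(weights, biases)):
        z = sum(w * xi for w, xi in zip(row, x)) + b  # affine pre-activation
        out.append(PRIMITIVES[ORDER[k % 4]](z))
    return out


# With sparse weights each unit reads as a formula: the four units below
# compute log(|x0|), x1, exp(x0) and sin(x1), up to the small EPS guard.
x = [2.0, 0.5]
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0, 0.0, 0.0]
y = lies_layer(x, W, b)
```

Because every unit is one of four known functions applied to a weighted sum, a network whose weights are mostly zero can be read off directly as a closed-form expression, which is why sparsity matters for the extraction step.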

Problem

Research questions and friction points this paper is trying to address.

Develop interpretable neural networks for symbolic regression

Overcome scalability and consistency issues in existing methods

Extract compact formulae via sparsity-promoting training and pruning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Fixed neural network with interpretable activations

Oversampling and tailored loss for sparsity

Pruning strategies to simplify expressions
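
The last two points can be sketched with standard stand-ins. The summary does not give the paper's actual loss or pruning rules, so the example below swaps in two common techniques: mean squared error with an L1 weight penalty for the sparsity-promoting loss, and magnitude thresholding for pruning. The names `sparse_loss` and `prune` and the values of `lam` and `threshold` are hypothetical.

```python
def sparse_loss(preds, targets, weights, lam=1e-2):
    """MSE plus an L1 penalty on all weights.

    lam is an assumed hyperparameter that trades accuracy for sparsity;
    this is a generic stand-in, not the paper's tailored loss.
    """
    mse = sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)
    l1 = sum(abs(w) for row in weights for w in row)
    return mse + lam * l1


def prune(weights, threshold=1e-2):
    """Zero out weights whose magnitude is below an assumed threshold."""
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]


# Toy weights: two large entries carry the formula, two small ones are noise.
W = [[0.98, 0.003], [-0.004, 1.5]]
loss = sparse_loss([1.0, 2.0], [1.0, 2.5], W, lam=0.1)
W_pruned = prune(W)  # small entries are replaced by 0.0
```

The L1 term pushes unneeded weights toward zero during training, and thresholding afterwards removes the near-zero remainder so the extracted expression contains only the terms that matter.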
