🤖 AI Summary
Existing symbolic networks struggle to naturally generalize binary nonlinear operators (e.g., multiplication, division) to multi-operand settings, and their fixed architectures often lead to overfitting and high expression complexity. To address these limitations, we propose the Unified Symbolic Network (UniSymNet): (1) it models binary operators as nested unary operators, enabling a unified operator representation; (2) it replaces hand-crafted architectures with a dynamic architecture search guided by a pre-trained Transformer; and (3) it establishes a differentiable, complexity-controllable framework for symbolic expression learning. Evaluated on Standard Benchmarks and SRBench, UniSymNet achieves significant improvements in both fitting accuracy and symbolic solution recovery rate, while simultaneously reducing expression complexity. It consistently outperforms state-of-the-art symbolic networks and conventional symbolic regression methods across all metrics.
📝 Abstract
Symbolic Regression (SR) is a powerful technique for automatically discovering mathematical expressions from input data. Mainstream SR algorithms search for the optimal symbolic tree in a vast function space, but the increasing complexity of the tree structure limits their performance. Inspired by neural networks, symbolic networks have emerged as a promising new paradigm. However, most existing symbolic networks still face certain challenges: binary nonlinear operators $\{\times, \div\}$ cannot be naturally extended to multivariate operators, and training with a fixed architecture often leads to higher complexity and overfitting. In this work, we propose UniSymNet, a Unified Symbolic Network that unifies nonlinear binary operators into nested unary operators, and we define the conditions under which UniSymNet can reduce complexity. Moreover, we pre-train a Transformer model with a novel label encoding method to guide structural selection, and adopt objective-specific optimization strategies to learn the parameters of the symbolic network. UniSymNet shows high fitting accuracy, an excellent symbolic solution rate, and relatively low expression complexity, achieving competitive performance on low-dimensional Standard Benchmarks and high-dimensional SRBench.
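The "nested unary operators" idea can be illustrated with a minimal sketch (our own illustration, not the paper's code): for positive operands, $\times$ and $\div$ reduce to compositions of the unary operators $\log$ and $\exp$, which extends naturally to any number of operands.

```python
import math

def product_as_unary(operands):
    """Multi-operand product expressed via nested unary ops:
    prod(x_1..x_n) = exp(log x_1 + ... + log x_n), assuming x_i > 0.
    """
    return math.exp(sum(math.log(x) for x in operands))

def division_as_unary(x, y):
    """Division expressed via nested unary ops:
    x / y = exp(log x - log y), assuming x, y > 0.
    """
    return math.exp(math.log(x) - math.log(y))
```

Because the binary operators become sums inside an `exp`, a single network layer can represent products over arbitrarily many inputs rather than a fixed-arity tree node; handling non-positive operands would need additional sign bookkeeping, which this sketch omits.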