SymTorch: A Framework for Symbolic Distillation of Deep Neural Networks

📅 2026-02-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
Although symbolic distillation can transform neural network components into interpretable closed-form mathematical expressions, its engineering complexity has hindered widespread adoption in deep learning. This work proposes SymTorch, the first library to enable general-purpose symbolic distillation for mainstream architectures, including GNNs, PINNs, and Transformers. By capturing the input-output behavior of network components and leveraging PySR for symbolic regression, SymTorch automatically generates human-readable expressions and supports seamless switching between neural and symbolic forward propagation. Key engineering contributions include optimized GPU-CPU data transfer, input-output caching, and model serialization. In a proof-of-concept experiment, replacing MLP layers in a large language model with symbolic surrogates improves inference throughput by 8.3% with moderate performance degradation.

📝 Abstract
Symbolic distillation replaces neural networks, or components thereof, with interpretable, closed-form mathematical expressions. This approach has shown promise in discovering physical laws and mathematical relationships directly from trained deep learning models, yet adoption remains limited due to the engineering barrier of integrating symbolic regression into deep learning workflows. We introduce SymTorch, a library that automates this distillation by wrapping neural network components, collecting their input-output behavior, and approximating them with human-readable equations via PySR. SymTorch handles the engineering challenges that have hindered adoption: GPU-CPU data transfer, input-output caching, model serialization, and seamless switching between neural and symbolic forward passes. We demonstrate SymTorch across diverse architectures including GNNs, PINNs, and transformer models. Finally, we present a proof-of-concept for accelerating LLM inference by replacing MLP layers with symbolic surrogates, achieving an 8.3% throughput improvement with moderate performance degradation.
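The wrap-record-distill-swap workflow the abstract describes can be sketched in miniature. The class and method names below are illustrative, not SymTorch's actual API, and an ordinary least-squares line fit stands in for PySR's symbolic regression:

```python
class SymbolicWrapper:
    """Wraps a scalar component, records its input-output behavior,
    and can swap in a distilled closed-form surrogate."""

    def __init__(self, component):
        self.component = component  # the "neural" callable
        self.surrogate = None       # closed-form replacement, once fitted
        self.cache = []             # recorded (input, output) pairs
        self.use_symbolic = False   # which forward pass to run

    def __call__(self, x):
        if self.use_symbolic and self.surrogate is not None:
            return self.surrogate(x)  # symbolic forward pass
        y = self.component(x)         # neural forward pass
        self.cache.append((x, y))     # input-output caching
        return y

    def distill(self):
        # Stand-in for symbolic regression: fit y ~ a*x + b by least squares.
        xs = [x for x, _ in self.cache]
        ys = [y for _, y in self.cache]
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
        b = my - a * mx
        self.surrogate = lambda x: a * x + b
        self.use_symbolic = True  # switch the forward pass to symbolic
        return a, b

# Usage: distill a component that is secretly 3x + 1.
wrapped = SymbolicWrapper(lambda x: 3.0 * x + 1.0)
for x in range(10):
    wrapped(float(x))     # record input-output behavior
a, b = wrapped.distill()  # → a = 3.0, b = 1.0
print(wrapped(100.0))     # symbolic forward pass → 301.0
```

In the real library, the recorded tensors would have to be moved from GPU to CPU before regression and the fitted expression serialized alongside the model; those engineering details are exactly what the abstract credits SymTorch with handling.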
Problem

Research questions and friction points this paper is trying to address.

symbolic distillation
symbolic regression
deep neural networks
interpretable models
engineering barrier
Innovation

Methods, ideas, or system contributions that make the work stand out.

symbolic distillation
interpretable AI
PySR
neural-symbolic integration
LLM acceleration
Elizabeth S. Z. Tan
Department of Applied Mathematics and Theoretical Physics, University of Cambridge, United Kingdom
Adil Soubki
Stony Brook University
Natural Language Processing · Machine Learning · Applied Mathematics
Miles Cranmer
University of Cambridge
Machine Learning · Astrophysics · Fluid Dynamics