Discovering physical laws with parallel combinatorial tree search

📅 2024-07-05

🤖 AI Summary

Symbolic regression must balance formula simplicity, generalization, and search efficiency within an infinite expression space, a trade-off that limits its use in scientific discovery. To address this, we propose the parallel combinatorial tree search (PCTS) framework, which integrates syntax-tree structural constraints, distributed enumeration, semantics-aware priority scheduling, and a differentiable symbolic evaluator, complemented by a parallelized pruning mechanism. Evaluated on over 200 synthetic and experimental benchmark datasets, PCTS improves accuracy by up to 99% and runs about an order of magnitude faster than state-of-the-art methods. It offers an efficient way to discover interpretable physical laws from limited observational data, advancing the practical utility of symbolic regression in scientific modeling.

📝 Abstract

Symbolic regression plays a crucial role in modern scientific research because it can discover concise, interpretable mathematical expressions from data. A grand challenge is the arduous search, over an infinite space, for parsimonious and generalizable formulas that also fit the training data. For over a decade, existing algorithms have faced a critical bottleneck in accuracy and efficiency on complex problems, which hinders the application of symbolic regression to scientific exploration across disciplines. To this end, we introduce a parallel combinatorial tree search (PCTS) model to efficiently distill generic mathematical expressions from limited data. Through extensive experiments, we demonstrate the superior accuracy and efficiency of PCTS for equation discovery: it greatly outperforms state-of-the-art baseline models on over 200 synthetic and experimental datasets (e.g., up to a 99% accuracy improvement and a one-order-of-magnitude speedup). PCTS represents a key advance in accurate and efficient data-driven discovery of symbolic, interpretable models (e.g., underlying physical laws) and marks a pivotal transition towards scalable symbolic learning.
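To make the search problem in the abstract concrete, the sketch below enumerates small expression trees exhaustively and keeps the one that fits the data best, preferring smaller trees on ties. It is a toy illustration of combinatorial search over expression trees, not the PCTS algorithm: the operator set, the leaf set, the depth limit, and every function name here are assumptions chosen for the example, and nothing is parallelized or pruned.

```python
import itertools

# Hypothetical building blocks, chosen only for this illustration.
LEAVES = ["x", "1"]
BINARY_OPS = {
    "+": lambda a, b: a + b,
    "*": lambda a, b: a * b,
}


def enumerate_trees(depth):
    """Yield every expression tree of height <= depth exactly once.

    A tree is either a leaf string or a tuple (op, left, right).
    """
    yield from LEAVES
    if depth == 0:
        return
    subtrees = list(enumerate_trees(depth - 1))
    for op in BINARY_OPS:
        for left, right in itertools.product(subtrees, repeat=2):
            yield (op, left, right)


def evaluate(tree, x):
    """Evaluate a tree at a single input value x."""
    if tree == "x":
        return x
    if isinstance(tree, str):
        return float(tree)  # numeric constant leaf
    op, left, right = tree
    return BINARY_OPS[op](evaluate(left, x), evaluate(right, x))


def size(tree):
    """Number of nodes, used as a simple parsimony measure."""
    if isinstance(tree, str):
        return 1
    return 1 + size(tree[1]) + size(tree[2])


def to_string(tree):
    """Render a tree as a fully parenthesized infix expression."""
    if isinstance(tree, str):
        return tree
    op, left, right = tree
    return f"({to_string(left)} {op} {to_string(right)})"


def mse(tree, xs, ys):
    """Mean squared error of a tree on the training data."""
    return sum((evaluate(tree, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit(xs, ys, max_depth):
    """Return (best_tree, error), ranking by error first and size second."""
    best = min(
        enumerate_trees(max_depth),
        key=lambda t: (mse(t, xs, ys), size(t)),
    )
    return best, mse(best, xs, ys)


# Usage: recover a formula equivalent to y = x^2 + x from five samples.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [x * x + x for x in xs]
best, err = fit(xs, ys, max_depth=2)
print(to_string(best), err)
```

Even with two leaves and two operators, the number of candidate trees grows from 10 at depth 1 to 202 at depth 2 and 81,610 at depth 3, which illustrates why exhaustive enumeration does not scale and why efficient search strategies matter for symbolic regression.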

Problem

Research questions and friction points this paper is trying to address.

Efficiently discover parsimonious mathematical formulas from data.

Overcome accuracy and efficiency bottlenecks in symbolic regression.

Enable scalable and interpretable model discovery for scientific exploration.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Parallel combinatorial tree search for symbolic regression

Efficient distillation of mathematical expressions from data

Superior accuracy and efficiency in equation discovery
