Al-Khwarizmi: Discovering Physical Laws with Foundation Models

📅 2025-02-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
Automated discovery of interpretable physical laws remains challenging: conventional SINDy pipelines depend on expert-crafted candidate function libraries and hand-tuned optimization strategies, which limits their generalizability. This paper introduces Al-Khwarizmi, a novel agentic framework that integrates large language models (LLMs), vision-language models (VLMs), and retrieval-augmented generation (RAG) to inject physics-informed priors, interpret multimodal observations (text, time-series data, and images), and generate candidate libraries automatically. The framework builds interpretable dynamical models via L1-regularized sparse regression coupled with iterative, reflection-based refinement. Evaluated on over 198 models, it achieves state-of-the-art performance, a 20% improvement over the best-performing baseline, while substantially reducing dependence on domain expertise.
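
For readers unfamiliar with the sparse regression step mentioned above, the standard SINDy formulation can be written as follows. The symbols are assumptions introduced here for illustration and do not come from the paper's summary: X holds the measured states, \dot{X} their time derivatives, \Theta(X) the candidate function library evaluated on the data, \Xi the coefficient matrix, and \lambda the regularization weight.

```latex
\dot{X} \approx \Theta(X)\,\Xi,
\qquad
\hat{\Xi} = \arg\min_{\Xi}\;
  \tfrac{1}{2}\,\bigl\lVert \dot{X} - \Theta(X)\,\Xi \bigr\rVert_F^2
  + \lambda\,\lVert \Xi \rVert_1
```

Each nonzero entry of \hat{\Xi} selects one library term for one state equation, so the choice of \Theta and \lambda directly determines which laws can be recovered.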

📝 Abstract
Inferring physical laws from data is a central challenge across science and engineering, in fields such as healthcare, the physical sciences, biosciences, social sciences, sustainability, climate, and robotics. Deep networks deliver high accuracy but lack interpretability, which has prompted interest in models built from simple components. The Sparse Identification of Nonlinear Dynamics (SINDy) method has become the go-to approach for building such modular, interpretable models. SINDy uses sparse regression with L1 regularization to select the key terms from a library of candidate functions. However, choosing the candidate library and the optimization method requires significant technical expertise, which limits SINDy's applicability. This work introduces Al-Khwarizmi, a novel agentic framework for physical law discovery from data that integrates foundation models with SINDy. Leveraging LLMs, VLMs, and Retrieval-Augmented Generation (RAG), the approach automates physical law discovery, incorporates prior knowledge, and iteratively refines candidate solutions via reflection. Al-Khwarizmi operates in two steps: it first summarizes the system observations (textual descriptions, raw data, and plots), then generates candidate feature libraries and optimizer configurations to identify the hidden physical laws. Evaluated on over 198 models, the algorithm achieves state-of-the-art performance, a 20 percent improvement over the best-performing alternative.
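
To make the sparse regression step concrete, here is a minimal sketch of SINDy-style term selection with an L1 penalty, written in plain Python. Everything in it is an assumption made for illustration rather than the paper's implementation: the toy system dx/dt = -2x, the three-term library [1, x, x^2], the penalty weight 0.01, the helper names, and the use of cyclic coordinate descent as the Lasso solver.

```python
import math

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (proximal map of the L1 norm)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def lasso_coordinate_descent(theta, target, lam, n_iter=50):
    """Minimise (1/2n)||target - theta xi||^2 + lam * ||xi||_1 by cyclic coordinate descent."""
    n_samples, n_features = len(theta), len(theta[0])
    xi = [0.0] * n_features
    for _ in range(n_iter):
        for j in range(n_features):
            rho = 0.0      # correlation of feature j with the partial residual
            norm_sq = 0.0  # squared norm of feature j
            for i in range(n_samples):
                pred_without_j = sum(
                    theta[i][k] * xi[k] for k in range(n_features) if k != j
                )
                rho += theta[i][j] * (target[i] - pred_without_j)
                norm_sq += theta[i][j] ** 2
            xi[j] = soft_threshold(rho / n_samples, lam) / (norm_sq / n_samples)
    return xi

def simulate(x0, rate=-2.0, dt=0.01, steps=201):
    """Sample the exact solution x(t) = x0 * exp(rate * t) of dx/dt = rate * x."""
    return [x0 * math.exp(rate * dt * k) for k in range(steps)]

def central_difference(xs, dt):
    """Estimate dx/dt at the interior samples of a trajectory."""
    return [(xs[k + 1] - xs[k - 1]) / (2 * dt) for k in range(1, len(xs) - 1)]

# Two trajectories with opposite initial conditions (an illustrative choice).
dt = 0.01
states, derivatives = [], []
for x0 in (1.5, -1.5):
    xs = simulate(x0, dt=dt)
    states.extend(xs[1:-1])
    derivatives.extend(central_difference(xs, dt))

# Hypothetical candidate library: constant, linear, and quadratic terms.
feature_names = ["1", "x", "x^2"]
theta = [[1.0, x, x * x] for x in states]

xi = lasso_coordinate_descent(theta, derivatives, lam=0.01)
print({name: round(coef, 3) for name, coef in zip(feature_names, xi)})
```

On this toy data the penalty zeroes out the constant and quadratic terms and leaves an x coefficient close to -2, slightly shrunk toward zero by the L1 term. A different library or penalty weight can change which terms survive, which is the sensitivity the abstract describes.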
Problem

Research questions and friction points this paper is trying to address.

Automating physical law discovery
Enhancing SINDy method usability
Integrating foundation models with SINDy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates foundation models with SINDy
Uses LLMs, VLMs, and RAG for automation
Generates candidate libraries and optimizers
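
The propose, fit, and refine cycle implied by the points above can be sketched as a small loop. This is a hypothetical illustration, not the paper's system: `propose_library` is a hard-coded placeholder standing in for a foundation-model call, plain least squares replaces the sparse regression step to keep the sketch short, and the toy system dx/dt = -x^3, the tolerance, and all helper names are assumptions.

```python
import math

def solve_least_squares(theta, target):
    """Solve the normal equations (theta^T theta) xi = theta^T target by Gauss-Jordan elimination."""
    m = len(theta[0])
    # Augmented matrix [theta^T theta | theta^T target]
    aug = [
        [sum(row[i] * row[j] for row in theta) for j in range(m)]
        + [sum(row[i] * y for row, y in zip(theta, target))]
        for i in range(m)
    ]
    for col in range(m):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(m):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][m] / aug[i][i] for i in range(m)]

def relative_error(theta, target, xi):
    """Residual norm divided by the norm of the target derivatives."""
    residual = sum(
        (y - sum(f * c for f, c in zip(row, xi))) ** 2
        for row, y in zip(theta, target)
    )
    return math.sqrt(residual / sum(y * y for y in target))

# Hypothetical proposals; in an agentic system these would come from a model.
CANDIDATE_LIBRARIES = [
    {"x": lambda x: x},
    {"x": lambda x: x, "x^2": lambda x: x * x},
    {"x": lambda x: x, "x^2": lambda x: x * x, "x^3": lambda x: x ** 3},
]

def propose_library(round_index, feedback):
    """Placeholder for a model call: ignores feedback and returns a fixed proposal."""
    return CANDIDATE_LIBRARIES[min(round_index, len(CANDIDATE_LIBRARIES) - 1)]

def discover(states, derivatives, tolerance=0.05, max_rounds=5):
    """Propose a library, fit it, and retry with feedback until the fit is good enough."""
    feedback = None
    for round_index in range(max_rounds):
        library = propose_library(round_index, feedback)
        theta = [[f(x) for f in library.values()] for x in states]
        xi = solve_least_squares(theta, derivatives)
        error = relative_error(theta, derivatives, xi)
        if error < tolerance:
            break
        # Feedback a real proposer could use to revise its next suggestion.
        feedback = f"library {list(library)} left relative error {error:.3f}"
    return dict(zip(library, xi)), error, round_index + 1

# Synthetic data from dx/dt = -x^3 with exact derivatives (illustrative only).
states = [-1.0 + 0.1 * k for k in range(21)]
derivatives = [-(x ** 3) for x in states]

model, error, rounds = discover(states, derivatives)
print(rounds, {name: round(coef, 3) for name, coef in model.items()})
```

The loop stops as soon as a proposed library explains the data within the tolerance, so the quality of the proposals and of the feedback string governs how many rounds are needed.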