DiSciPLE: Learning Interpretable Programs for Scientific Visual Discovery

📅 2025-02-14
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
The lack of interpretable models for scientific visual data hinders mechanistic understanding and scientific discovery. Method: This paper proposes an *interpretability-by-design* paradigm for program learning, automatically synthesizing Python programs that reveal underlying physical mechanisms. We introduce the first neuro-symbolic program synthesis framework integrating large language model (LLM) priors with evolutionary search, augmented by a program evaluator and a semantic simplifier to jointly optimize accuracy, readability, and physical interpretability in an end-to-end manner. Contribution/Results: Evaluated on three real-world scientific tasks—including population density estimation—our method achieves state-of-the-art performance: 35% lower error than the best black-box model, while all synthesized programs are fully human-readable and semantically grounded in domain physics. This work bridges a critical gap between automated modeling and rigorous interpretability in scientific AI.

Technology Category

Application Category

📝 Abstract
Visual data is used in numerous different scientific workflows ranging from remote sensing to ecology. As the amount of observation data increases, the challenge is not just to make accurate predictions but also to understand the underlying mechanisms for those predictions. Good interpretation is important in scientific workflows, as it allows for better decision-making by providing insights into the data. This paper introduces an automatic way of obtaining such interpretable-by-design models, by learning programs that interleave neural networks. We propose DiSciPLE (Discovering Scientific Programs using LLMs and Evolution) an evolutionary algorithm that leverages common sense and prior knowledge of large language models (LLMs) to create Python programs explaining visual data. Additionally, we propose two improvements: a program critic and a program simplifier to improve our method further to synthesize good programs. On three different real-world problems, DiSciPLE learns state-of-the-art programs on novel tasks with no prior literature. For example, we can learn programs with 35% lower error than the closest non-interpretable baseline for population density estimation.
Problem

Research questions and friction points this paper is trying to address.

Learning interpretable scientific programs
Improving visual data understanding
Reducing error in population estimation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Evolutionary algorithm with LLMs
Program critic and simplifier
State-of-the-art interpretable models
🔎 Similar Papers
No similar papers found.