Towards symbolic regression for interpretable clinical decision scores

Date: 2025-12-08
AI Summary
Clinical risk scoring requires balancing interpretability with data-driven modeling, yet existing methods struggle to jointly capture rule-based logic and nonlinear functional forms. Method: We propose Brush, the first algorithm that deeply integrates decision-tree-style splitting with symbolic regression (SR), overcoming SR's traditional limitation to continuous functions. Brush introduces nonlinear constant optimization and Pareto-optimal search, enabling end-to-end training and evaluation on the SRBench benchmark. Contribution/Results: Brush accurately reconstructs established clinical scores (e.g., APACHE II, SOFA) with high fidelity and compact symbolic expressions. Compared to decision trees, random forests, and state-of-the-art SR methods, it achieves significantly enhanced interpretability while matching or exceeding predictive performance. By unifying symbolic rules and nonlinear functions in a traceable, standardized framework, Brush establishes a novel paradigm for developing clinically actionable, auditable diagnostic and therapeutic pathways.

πŸ“ Abstract
Medical decision-making makes frequent use of algorithms that combine risk equations with rules, providing clear and standardized treatment pathways. Symbolic regression (SR) traditionally limits its search space to continuous function forms and their parameters, making it difficult to model this decision-making. However, due to its ability to derive data-driven, interpretable models, SR holds promise for developing data-driven clinical risk scores. To that end we introduce Brush, an SR algorithm that combines decision-tree-like splitting algorithms with non-linear constant optimization, allowing for seamless integration of rule-based logic into symbolic regression and classification models. Brush achieves Pareto-optimal performance on SRBench, and was applied to recapitulate two widely used clinical scoring systems, achieving high accuracy and interpretable models. Compared to decision trees, random forests, and other SR methods, Brush achieves comparable or superior predictive performance while producing simpler models.
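To make the idea of rule-based logic inside a symbolic expression concrete, here is a minimal sketch of a threshold split used as one term of a risk equation. It is not Brush's implementation or API: the names `SplitNode`, `best_threshold`, and `risk_score`, the features `lactate` and `age`, the coefficient `0.05`, and the toy data are all hypothetical, and the threshold search is an ordinary regression-stump criterion standing in for whatever splitting procedure the paper uses. Constant optimization is not shown.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, float]


@dataclass
class SplitNode:
    """Expression node: if row[feature] > threshold, evaluate one branch, else the other."""
    feature: str
    threshold: float
    if_above: Callable[[Row], float]
    otherwise: Callable[[Row], float]

    def __call__(self, row: Row) -> float:
        branch = self.if_above if row[self.feature] > self.threshold else self.otherwise
        return branch(row)


def sse(ys: List[float]) -> float:
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def best_threshold(rows: List[Row], targets: List[float], feature: str) -> float:
    """Stump-style search: try midpoints between sorted distinct values and
    keep the one that minimises the summed squared error of the two sides."""
    values = sorted({row[feature] for row in rows})
    if len(values) < 2:
        raise ValueError("need at least two distinct feature values to split")
    midpoints = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def cost(t: float) -> float:
        above = [y for row, y in zip(rows, targets) if row[feature] > t]
        below = [y for row, y in zip(rows, targets) if row[feature] <= t]
        return sse(above) + sse(below)

    return min(midpoints, key=cost)


# Toy data (invented for illustration): points assigned by a lactate rule.
rows = [
    {"lactate": 1.0, "age": 40.0},
    {"lactate": 1.5, "age": 60.0},
    {"lactate": 2.5, "age": 40.0},
    {"lactate": 3.0, "age": 60.0},
]
points = [0.0, 0.0, 2.0, 2.0]

threshold = best_threshold(rows, points, "lactate")
lactate_rule = SplitNode("lactate", threshold, lambda r: 2.0, lambda r: 0.0)


def risk_score(row: Row) -> float:
    """A continuous term plus a rule term, mixed in one expression."""
    return 0.05 * row["age"] + lactate_rule(row)


high = risk_score(rows[3])
low = risk_score(rows[0])
```

The point of the sketch is that the split node is just another node in the expression tree, so a search procedure can combine it freely with arithmetic and nonlinear terms.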
Problem

Research questions and friction points this paper is trying to address.

Develop interpretable clinical decision scores using symbolic regression
Integrate rule-based logic into symbolic regression for medical algorithms
Recapitulate clinical scoring systems with high accuracy and simplicity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates decision-tree splitting with constant optimization
Enables rule-based logic in symbolic regression models
Achieves high accuracy with simpler interpretable models
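The trade-off between accuracy and simplicity is usually expressed as a Pareto front over (error, complexity) pairs. The following is a minimal sketch of that filtering step only; the candidate expressions, error values, and complexity counts are invented for illustration and are not results from the paper, and `dominates` and `pareto_front` are hypothetical helper names.

```python
from typing import List, Tuple

# (expression, error, complexity): lower is better on both axes.
Model = Tuple[str, float, int]


def dominates(a: Model, b: Model) -> bool:
    """a dominates b if it is no worse on both objectives and strictly better on one."""
    no_worse = a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(models: List[Model]) -> List[Model]:
    """Keep every model that no other model dominates."""
    return [m for m in models if not any(dominates(other, m) for other in models)]


# Invented candidates, for illustration only.
candidates: List[Model] = [
    ("x1", 0.40, 1),
    ("x1 + x2", 0.30, 3),
    ("if x1 > 2 then 1 else 0", 0.25, 4),
    ("sin(x1) * x2 + x1", 0.30, 6),  # same error as "x1 + x2" but more complex
    ("large ensemble-like expression", 0.24, 40),
]

front = pareto_front(candidates)
front_names = [name for name, _, _ in front]
```

Here only `sin(x1) * x2 + x1` is dropped, because `x1 + x2` matches its error with fewer nodes; the remaining four models each represent a different accuracy and simplicity trade-off.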
Guilherme Seidyo Imai Aldeia
Federal University of ABC, SP, Brazil
Joseph D. Romano
Institute for Biomedical Informatics, University of Pennsylvania, PA, US
Fabricio Olivetti de Franca
Federal University of ABC, SP, Brazil
Daniel S. Herman
Department of Pathology and Laboratory Medicine, University of Pennsylvania, PA, US
William G. La Cava
Harvard, Boston Children's Hospital
biomedical informatics, machine learning, fairness, interpretability, symbolic regression