From Specification to Architecture: A Theory Compiler for Knowledge-Guided Machine Learning

📅 2026-03-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
Manually translating domain theories into neural network architectures generalizes poorly across domains, is not formally verified, and does not scale, so strict alignment between a model and its prior knowledge is rarely guaranteed. This work proposes a Theory Compiler: a system that takes a formalized, typed domain theory as input and automatically produces an architecture whose function space is consistent with that theory by construction. The proposal draws on formal methods, type systems, neural architecture design, and statistical learning theory, and it frames three open problems: a universal theory formalization language with decidable type-checking, a compositionally correct compilation algorithm, and soundness and completeness criteria for formal verification. The authors conjecture, rather than prove, that compiled architectures match or exceed handcrafted designs in generalization performance while requiring substantially less training data, and they cite recent advances in large language models as one reason the paradigm is now achievable.
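To make the idea of mapping theory primitives to architectural modules more concrete, here is a minimal Python sketch. It is not the authors' compiler: the primitives `EvenSymmetry` and `NonNegativeOutput`, the functions `compile_primitive` and `compile_theory`, and the scalar `backbone` are all hypothetical names chosen for illustration. Each primitive is compiled into a wrapper, and the wrappers compose, so the final function satisfies every declared property whatever parameters the backbone holds.

```python
import math
from dataclasses import dataclass
from typing import Callable, Sequence, Union

Fn = Callable[[float], float]


@dataclass(frozen=True)
class EvenSymmetry:
    """Hypothetical theory primitive: f(x) == f(-x) for every x."""


@dataclass(frozen=True)
class NonNegativeOutput:
    """Hypothetical theory primitive: f(x) >= 0 for every x."""


Primitive = Union[EvenSymmetry, NonNegativeOutput]


def softplus(z: float) -> float:
    """Numerically stable log(1 + exp(z)); the result is never negative."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def compile_primitive(primitive: Primitive, inner: Fn) -> Fn:
    """Map one theory primitive to a wrapper module around `inner`."""
    if isinstance(primitive, EvenSymmetry):
        # Feeding |x| makes the output identical for x and -x.
        return lambda x: inner(abs(x))
    if isinstance(primitive, NonNegativeOutput):
        # Softplus on the output keeps every prediction >= 0.
        return lambda x: softplus(inner(x))
    raise TypeError(f"unsupported theory primitive: {primitive!r}")


def compile_theory(theory: Sequence[Primitive], backbone: Fn) -> Fn:
    """Compose the wrappers for all primitives around a free backbone."""
    model = backbone
    for primitive in theory:
        model = compile_primitive(primitive, model)
    return model


# An unconstrained backbone with arbitrary (untrained) parameters.
def backbone(x: float) -> float:
    return -3.0 * x + 0.5


model = compile_theory([EvenSymmetry(), NonNegativeOutput()], backbone)
```

In this toy setting the backbone alone violates both properties (for example, `backbone(1.0)` is negative), while `model` is even and non-negative for every input because the constraints live in the structure rather than in a penalty term. A real system would also need the type-checking and verification steps that the work lists as open problems.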

📝 Abstract
Theory-guided machine learning has demonstrated that incorporating authentic domain knowledge directly into model design improves performance, sample efficiency, and out-of-distribution generalisation. Yet the process by which a formal domain theory is translated into architectural constraints remains entirely manual, specific to each domain formalism, and devoid of any formal correctness guarantee. This translation is not transferable between domains, is not verified, and does not scale. We propose the Theory Compiler: a system that accepts a typed, machine-readable domain theory as input and automatically produces an architecture whose function space is provably constrained to be consistent with that theory by construction, not by regularisation. We identify three foundational open problems whose resolution defines our research agenda: (1) designing a universal theory formalisation language with decidable type-checking; (2) constructing a compositionally correct compilation algorithm from theory primitives to architectural modules; and (3) establishing soundness and completeness criteria for formal verification. We further conjecture that compiled architectures match or exceed manually designed counterparts in generalisation performance while requiring substantially less training data, a claim we ground in classical statistical learning theory. We argue that recent advances in formal machine learning theory, large language models, and the growth of an interdisciplinary research community have made this paradigm achievable for the first time.
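The appeal to classical statistical learning theory can be illustrated with a standard textbook bound. This is an illustration under stated assumptions, not the authors' argument: assume a finite hypothesis class $\mathcal{H}$, the realizable setting, and a theory-consistent subclass $\mathcal{H}_T \subseteq \mathcal{H}$ that still contains the target function. The symbols $\mathcal{H}_T$, $\epsilon$, $\delta$, and $m$ are introduced here and do not appear in the abstract.

```latex
% Realizable PAC bound for a finite class: with probability at least
% 1 - \delta, every hypothesis consistent with m i.i.d. samples has
% true error at most \epsilon whenever
\[
  m \;\ge\; \frac{1}{\epsilon}\left(\ln|\mathcal{H}| + \ln\frac{1}{\delta}\right).
\]
% Restricting the search to the theory-consistent subclass can only
% lower this sufficient sample size:
\[
  \mathcal{H}_T \subseteq \mathcal{H}
  \;\Longrightarrow\;
  \frac{1}{\epsilon}\left(\ln|\mathcal{H}_T| + \ln\frac{1}{\delta}\right)
  \;\le\;
  \frac{1}{\epsilon}\left(\ln|\mathcal{H}| + \ln\frac{1}{\delta}\right).
\]
```

Under these assumptions the sufficient number of samples grows only with the logarithm of the class size, so a constraint that removes most of $\mathcal{H}$ reduces it. If the theory is wrong and excludes the target, the realizable assumption fails and the bound no longer applies.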
Problem

Research questions and friction points this paper is trying to address.

theory-guided machine learning
domain theory
architecture compilation
formal verification
knowledge integration
Innovation

Methods, ideas, or system contributions that make the work stand out.

Theory Compiler
Knowledge-Guided Machine Learning
Formal Verification
Architecture Compilation
Typed Domain Theory
Asela Hevapathige
AI, Optimization and Pattern Recognition Research Group, Faculty of Engineering and Information Technology, University of Melbourne, Melbourne, Australia
Yu Xia
Research Fellow, The University of Melbourne
machine learning
Sachith Seneviratne
Research Fellow in Computer Vision, University Of Melbourne
Machine Learning, Computer Vision, Natural Language Processing, Urban Informatics
Saman Halgamuge
AI, Optimization and Pattern Recognition Research Group, Faculty of Engineering and Information Technology, University of Melbourne, Melbourne, Australia