Compiled AI: Deterministic Code Generation for LLM-Based Workflow Automation

2026-04-06
Citations: 0
Influential: 0
AI Summary
This work addresses the low reliability, poor auditability, high cost, and security risks of invoking large language models (LLMs) at runtime in high-stakes enterprise workflows. It studies a "compiled AI" paradigm in which LLMs generate executable code during a compilation phase, so that no model calls are needed at runtime and execution is deterministic. The authors present a systems-oriented application of this paradigm to high-stakes settings, combining constrained code generation, a four-stage generation-and-validation pipeline, narrow business-logic functions embedded in validated templates, and an evaluation framework built around operational metrics. Experiments show 96% task completion on function-calling tasks with zero execution tokens; 80.0% accuracy on key field extraction (KILE) and 80.4% on line item recognition (LIR) in document intelligence tasks; and, on security, 96.7% accuracy on prompt injection detection and 87.5% accuracy on static code safety analysis with zero false positives.
Abstract
We study compiled AI, a paradigm in which large language models generate executable code artifacts during a compilation phase, after which workflows execute deterministically without further model invocation. This paradigm has antecedents in prior work on declarative pipeline optimization (DSPy) and hybrid neural-symbolic planning (LLM+P); our contribution is a systems-oriented study of its application to high-stakes enterprise workflows, with particular emphasis on healthcare settings where reliability and auditability are critical. By constraining generation to narrow business-logic functions embedded in validated templates, compiled AI trades runtime flexibility for predictability, auditability, cost efficiency, and reduced security exposure. We introduce (i) a system architecture for constrained LLM-based code generation, (ii) a four-stage generation-and-validation pipeline that converts probabilistic model output into production-ready code artifacts, and (iii) an evaluation framework measuring operational metrics including token amortization, determinism, reliability, security, and cost. We evaluate on two task types: function-calling (BFCL, n=400) and document intelligence (DocILE, n=5,680 invoices). On function-calling, compiled AI achieves 96% task completion with zero execution tokens, breaking even with runtime inference at approximately 17 transactions and reducing token consumption by 57x at 1,000 transactions. On document intelligence, our Code Factory variant matches Direct LLM on key field extraction (KILE: 80.0%) while achieving the highest line item recognition accuracy (LIR: 80.4%). Security evaluation across 135 test cases demonstrates 96.7% accuracy on prompt injection detection and 87.5% on static code safety analysis with zero false positives.
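The two amortization figures in the abstract can be cross-checked with a little arithmetic. The sketch below assumes a simple cost model that the abstract does not spell out: a one-time compilation cost, a fixed per-transaction cost for runtime inference, and zero execution tokens for the compiled workflow. Under that assumption the token reduction after n transactions is n divided by the break-even point. The function names are illustrative, not taken from the paper.

```python
def token_reduction(n_transactions: int, break_even: float) -> float:
    """Ratio of runtime-inference tokens to compiled-workflow tokens.

    Assumed model (not stated in the abstract): one-time compile cost C,
    fixed runtime cost r per transaction, zero compiled execution tokens.
    Then break_even = C / r and the ratio is (n * r) / C = n / break_even.
    """
    if break_even <= 0:
        raise ValueError("break_even must be positive")
    return n_transactions / break_even


def implied_break_even(n_transactions: int, reduction: float) -> float:
    """Break-even point implied by a reported reduction at n transactions."""
    if reduction <= 0:
        raise ValueError("reduction must be positive")
    return n_transactions / reduction


# Reported in the abstract: 57x fewer tokens at 1,000 transactions.
implied = implied_break_even(1000, 57.0)
print(f"implied break-even: {implied:.1f} transactions")
print(f"reduction at 1,000 transactions: {token_reduction(1000, implied):.0f}x")
```

Under this model a 57x reduction at 1,000 transactions implies a break-even point between 17 and 18 transactions, which agrees with the reported figure of approximately 17.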
Problem

Research questions and friction points this paper is trying to address.

compiled AI
deterministic execution
workflow automation
reliability
auditability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Compiled AI
Deterministic Execution
Constrained Code Generation
LLM-based Workflow Automation
Code Validation Pipeline
Geert Trooskens
XY.AI Labs, Palo Alto, CA
Aaron Karlsberg
XY.AI Labs, Palo Alto, CA
Anmol Sharma
XY.AI Labs, Palo Alto, CA
Lamara De Brouwer
XY.AI Labs, Palo Alto, CA
Max Van Puyvelde
Stanford University School of Medicine, Stanford, CA
Matthew Young
Rutgers University
Analytic number theory
John Thickstun
Assistant Professor, Cornell University
Machine Learning, Generative Models, Music Technology, Natural Language Processing
Gil Alterovitz
Brigham and Women's Hospital / Harvard Medical School, Boston, MA
Walter A. De Brouwer
Stanford University School of Medicine, Stanford, CA