AI Summary
Current LLM pipelines treat prompts as unstructured strings, suffering from opacity, poor reusability, and lack of runtime controllability. This paper introduces SPEAR, the first framework to elevate prompts to structured, programmable, and versioned first-class citizens. Its core innovations are (1) a prompt algebra enabling composability, abstraction, and optimization; and (2) a versioned view mechanism supporting dynamic adaptation and fine-grained observability. SPEAR integrates operator fusion, prefix caching, and view reuse, and supports three optimization modes: manual, assisted, and fully automatic. Experiments demonstrate that SPEAR significantly outperforms static prompting and retry-based baselines, improving response quality by 12.7% in BLEU/accuracy and reducing inference latency by 38%. This work is the first to empirically validate the feasibility and effectiveness of structured, prompt-level optimization.
Abstract
Modern LLM pipelines increasingly resemble data-centric systems: they retrieve external context, compose intermediate outputs, validate results, and adapt based on runtime feedback. Yet, the central element guiding this process -- the prompt -- remains a brittle, opaque string, disconnected from the surrounding dataflow. This disconnect limits reuse, optimization, and runtime control.
In this paper, we describe our vision and an initial design for SPEAR, a language and runtime that fills this prompt management gap by making prompts structured, adaptive, and first-class components of the execution model. SPEAR enables (1) runtime prompt refinement -- modifying prompts dynamically in response to execution-time signals such as confidence, latency, or missing context; and (2) structured prompt management -- organizing prompt fragments into versioned views with support for introspection and logging.
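To make the two capabilities concrete, the following is a minimal Python sketch of a versioned prompt view that is refined when a runtime confidence signal falls below a threshold. It is not SPEAR's actual API: the names `PromptView`, `refine`, and `refine_on_low_confidence`, the 0.7 threshold, and the appended instruction are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class PromptView:
    """Hypothetical versioned prompt: every refinement appends a new version."""
    name: str
    # Each entry is (prompt text, reason for the change), kept for introspection.
    versions: List[Tuple[str, str]] = field(default_factory=list)

    def current(self) -> str:
        """Return the text of the latest version."""
        return self.versions[-1][0]

    def refine(self, new_text: str, reason: str) -> int:
        """Record a new version and return its index for logging."""
        self.versions.append((new_text, reason))
        return len(self.versions) - 1


def refine_on_low_confidence(view: PromptView, confidence: float,
                             threshold: float = 0.7) -> bool:
    """Append a clarifying instruction when the runtime signal is weak.

    The threshold and the added instruction are illustrative assumptions.
    """
    if confidence >= threshold:
        return False
    view.refine(
        view.current() + "\nCite the passage that supports your answer.",
        reason=f"confidence {confidence:.2f} < {threshold:.2f}",
    )
    return True


view = PromptView("qa")
view.refine("Answer the question using the retrieved context.", reason="initial")
changed = refine_on_low_confidence(view, confidence=0.42)
```

Because each change is stored with its reason, the history in `view.versions` can serve as a simple log of why the prompt was adapted at runtime.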
SPEAR defines a prompt algebra that governs how prompts are constructed and adapted within a pipeline. It supports multiple refinement modes (manual, assisted, and automatic), giving developers a balance between control and automation. By treating prompt logic as structured data, SPEAR enables optimizations such as operator fusion, prefix caching, and view reuse. Preliminary experiments quantify the behavior of different refinement modes compared to static prompts and agentic retries, as well as the impact of prompt-level optimizations such as operator fusion.
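One possible reading of operator fusion at the prompt level is sketched below in Python: a run of adjacent prompt operators is collapsed into a single operator that produces the same prompt. The `Append` operator and the `apply_ops` and `fuse` helpers are hypothetical and are not taken from the paper, whose algebra may define fusion differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Append:
    """Hypothetical prompt operator: append a fragment to the prompt."""
    fragment: str


def apply_ops(base: str, ops: List[Append]) -> str:
    """Apply operators one by one, as an unoptimized pipeline would."""
    prompt = base
    for op in ops:
        prompt = prompt + "\n" + op.fragment
    return prompt


def fuse(ops: List[Append]) -> List[Append]:
    """Collapse a run of Append operators into a single equivalent operator."""
    if not ops:
        return []
    return [Append("\n".join(op.fragment for op in ops))]


ops = [Append("Use the retrieved context."), Append("Answer in one sentence.")]
fused = fuse(ops)

# The fused pipeline has one step but yields the same final prompt.
base = "You are a helpful assistant."
same_prompt = apply_ops(base, ops) == apply_ops(base, fused)
```

Fusion of this kind is only possible because the operators are structured data rather than opaque strings, so a runtime can inspect and rewrite the pipeline before executing it.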