Making Prompts First-Class Citizens for Adaptive LLM Pipelines

📅 2025-08-06

📈 Citations: 0

✨ Influential: 0

career value

183K/year

🤖 AI Summary

Current LLM pipelines treat prompts as unstructured strings, suffering from opacity, poor reusability, and lack of runtime controllability. This paper introduces SPEAR—the first framework to elevate prompts to structured, programmable, and versioned first-class citizens. Its core innovations are (1) a prompt algebra enabling composability, abstraction, and optimization; and (2) a versioned view mechanism supporting dynamic adaptation and fine-grained observability. SPEAR integrates operator fusion, prefix caching, and view reuse, and supports three optimization modes: manual, assisted, and fully automatic. Experiments demonstrate that SPEAR significantly outperforms static prompting and retry-based baselines, improving response quality by +12.7% in BLEU/accuracy and reducing inference latency by 38%. This work is the first to empirically validate the feasibility and effectiveness of structured, prompt-level optimization.

Technology Category

Application Category

📝 Abstract

Modern LLM pipelines increasingly resemble data-centric systems: they retrieve external context, compose intermediate outputs, validate results, and adapt based on runtime feedback. Yet, the central element guiding this process -- the prompt -- remains a brittle, opaque string, disconnected from the surrounding dataflow. This disconnect limits reuse, optimization, and runtime control. In this paper, we describe our vision and an initial design for SPEAR, a language and runtime that fills this prompt management gap by making prompts structured, adaptive, and first-class components of the execution model. SPEAR enables (1) runtime prompt refinement -- modifying prompts dynamically in response to execution-time signals such as confidence, latency, or missing context; and (2) structured prompt management -- organizing prompt fragments into versioned views with support for introspection and logging. SPEAR defines a prompt algebra that governs how prompts are constructed and adapted within a pipeline. It supports multiple refinement modes (manual, assisted, and automatic), giving developers a balance between control and automation. By treating prompt logic as structured data, SPEAR enables optimizations such as operator fusion, prefix caching, and view reuse. Preliminary experiments quantify the behavior of different refinement modes compared to static prompts and agentic retries, as well as the impact of prompt-level optimizations such as operator fusion.

Problem

Research questions and friction points this paper is trying to address.

Making prompts structured and adaptive in LLM pipelines

Enabling runtime prompt refinement based on execution signals

Supporting structured prompt management with versioning and logging

Innovation

Methods, ideas, or system contributions that make the work stand out.

Structured prompt management with versioned views

Runtime prompt refinement based on execution signals

Prompt algebra enabling construction and adaptation

🔎 Similar Papers

PhaseEvo: Towards Unified In-Context Prompt Optimization for Large Language Models