Adapting the Interface, Not the Model: Runtime Harness Adaptation for Deterministic LLM Agents

📅 2026-05-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the frequent failures of large language model (LLM) agents in deterministic, rule-governed environments, which often stem from mismatches at the model–environment interface rather than from limits in model capability. The authors propose Life-Harness, a lifecycle-aware runtime harness that shifts agent adaptation from model parameters to the runtime interface. Life-Harness evolves from training trajectories: it converts recurring interaction failures into reusable interventions, then stays fixed during held-out evaluation, with no changes to model weights or to the evaluation environments. Its interventions span four layers: environment contracts, procedural skills, action realization, and trajectory regulation. On seven deterministic environments from τ-bench, τ²-bench, and AgentBench, tested with 18 model backbones (126 model–environment settings), it improves 116 settings, with an average relative improvement of 88.5%. Notably, harnesses evolved only from Qwen3-4B-Instruct trajectories transfer to the 17 other models.
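To make the four intervention layers concrete, here is a minimal Python sketch of a harness wrapped around a frozen model. It is an illustration under stated assumptions, not the authors' implementation: `ToyHarness`, `run_episode`, the `"DONE"` sentinel, and every field name are hypothetical, and the summary does not specify how Life-Harness represents its interventions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical names throughout; this is not the Life-Harness API.


@dataclass
class ToyHarness:
    contract_notes: List[str] = field(default_factory=list)       # environment contracts
    skills: Dict[str, str] = field(default_factory=dict)          # procedural skills
    action_aliases: Dict[str, str] = field(default_factory=dict)  # action realization
    max_repeats: int = 2                                           # trajectory regulation

    def build_prompt(self, observation: str, task_kind: str) -> str:
        """Prepend contract notes and any matching skill to the raw observation."""
        parts = list(self.contract_notes)
        if task_kind in self.skills:
            parts.append(self.skills[task_kind])
        parts.append(observation)
        return "\n".join(parts)

    def realize_action(self, raw_action: str) -> str:
        """Map the model's raw output onto an action the environment accepts."""
        cleaned = raw_action.strip()
        return self.action_aliases.get(cleaned, cleaned)

    def should_stop(self, history: List[str]) -> bool:
        """Stop once the latest action has repeated more than max_repeats times in a row."""
        if not history:
            return False
        run = 0
        for action in reversed(history):
            if action != history[-1]:
                break
            run += 1
        return run > self.max_repeats


def run_episode(
    model: Callable[[str], str],     # frozen model: prompt -> raw action text
    env_step: Callable[[str], str],  # unmodified environment: action -> observation
    harness: ToyHarness,
    first_observation: str,
    task_kind: str,
    max_steps: int = 10,
) -> List[str]:
    """Run one episode; only the harness sits between model and environment."""
    history: List[str] = []
    observation = first_observation
    for _ in range(max_steps):
        prompt = harness.build_prompt(observation, task_kind)
        action = harness.realize_action(model(prompt))
        history.append(action)
        if harness.should_stop(history):
            break
        observation = env_step(action)
        if observation == "DONE":  # hypothetical success sentinel
            break
    return history
```

In this toy setup the model and the environment are passed in as plain callables and are never modified; any change in behavior comes from the harness fields alone, which mirrors the summary's point that adaptation happens at the interface.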

📝 Abstract

LLM agents are shaped not only by their language models, but also by the runtime harness that mediates observation, tool use, action execution, feedback interpretation, and trajectory control. While existing agent adaptation methods mainly update model parameters, many failures in deterministic, rule-governed domains stem from mismatches at the model–environment interface. We propose Life-Harness, a lifecycle-aware runtime harness that improves frozen LLM agents without changing model weights or evaluation environments. Life-Harness evolves from training trajectories by converting recurring interaction failures into reusable interventions across environment contracts, procedural skills, action realization, and trajectory regulation, and remains fixed during held-out evaluation. On seven deterministic environments from τ-bench, τ²-bench, and AgentBench, Life-Harness improves 116 out of 126 model–environment settings across 18 model backbones, with an average relative improvement of 88.5%. Harnesses evolved only from Qwen3-4B-Instruct trajectories transfer to 17 other models, showing that Life-Harness captures reusable environment-side structure rather than model-specific behavior. These results position runtime interface adaptation as a complementary alternative to model-centric agent training. Code is available at GitHub.
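The "evolve from training trajectories, then stay fixed" step can be sketched as follows. This is a rough illustration only: the abstract does not say how recurring failures are detected or how interventions are produced, so the error-prefix convention, the `min_count` threshold, and the hand-written `candidate_fixes` table are all assumptions, and every function name is hypothetical.

```python
from collections import Counter
from types import MappingProxyType
from typing import Dict, Iterable, List, Mapping, Tuple

# A trajectory is a list of (action, feedback) pairs. Hypothetical format.
Trajectory = List[Tuple[str, str]]


def failure_signatures(trajectory: Trajectory) -> List[Tuple[str, str]]:
    """Keep the steps whose feedback marks a failure (assumed 'ERROR' prefix)."""
    return [(a, fb) for a, fb in trajectory if fb.startswith("ERROR")]


def evolve_interventions(
    train_trajectories: Iterable[Trajectory],
    candidate_fixes: Dict[str, str],  # assumed table: failing action -> replacement
    min_count: int = 2,
) -> Mapping[str, str]:
    """Promote failures seen in at least min_count trajectories to interventions."""
    counts: Counter = Counter()
    for trajectory in train_trajectories:
        # Count each signature once per trajectory so one looping episode
        # cannot make a failure look recurring on its own.
        counts.update(set(failure_signatures(trajectory)))
    chosen = {
        action: candidate_fixes[action]
        for (action, _feedback), n in counts.items()
        if n >= min_count and action in candidate_fixes
    }
    # Read-only view: the result cannot be edited during held-out evaluation.
    return MappingProxyType(chosen)


train = [
    [("ls", "ERROR: unknown action"), ("list_files", "ok")],
    [("ls", "ERROR: unknown action")],
    [("rm", "ERROR: forbidden")],
]
frozen = evolve_interventions(train, {"ls": "list_files", "rm": "delete_file"})
```

Here only the `ls` failure recurs across trajectories, so it is the only one promoted; the one-off `rm` failure is ignored. Returning a read-only mapping is one simple way to enforce that nothing is learned from evaluation episodes.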

Problem

Research questions and friction points this paper is trying to address.

LLM agents

runtime harness

interface adaptation

deterministic environments

model-environment mismatch

Innovation

Methods, ideas, or system contributions that make the work stand out.

runtime harness adaptation

deterministic LLM agents

interface optimization