🤖 AI Summary
Current agentic AI systems are fragile and unreliable because they hand the control loop to large language models and rely on heuristic safeguards. This work proposes Arbiter-K, an architecture built on a “governance-first” execution paradigm: the underlying model is treated as a probabilistic processing unit encapsulated within a deterministic neuro-symbolic kernel. A Semantic Instruction Set Architecture (Semantic ISA) translates the model's probabilistic outputs into discrete, executable instructions. Security policies are embedded at the microarchitectural level: the kernel maintains a security context at runtime, constructs an instruction dependency graph, actively propagates taint based on dataflow provenance, and supports autonomous execution correction and architectural rollback. Empirical evaluation on OpenClaw and NanoBot shows that Arbiter-K intercepts 76% to 95% of unsafe behaviors, an absolute gain of 92.79% over native policy baselines.
📝 Abstract
The transition of agentic AI from brittle prototypes to production systems is stalled by a pervasive crisis of craft. We suggest that the prevailing orchestration paradigm, which delegates the system control loop to large language models and merely patches it with heuristic guardrails, is the root cause of this fragility. Instead, we propose Arbiter-K, a Governance-First execution architecture that reconceptualizes the underlying model as a Probabilistic Processing Unit encapsulated by a deterministic, neuro-symbolic kernel. Arbiter-K implements a Semantic Instruction Set Architecture (ISA) to reify probabilistic messages into discrete instructions. This allows the kernel to maintain a Security Context Registry and construct an Instruction Dependency Graph at runtime, enabling active taint propagation based on the data-flow pedigree of each reasoning node. By leveraging this mechanism, Arbiter-K precisely interdicts unsafe trajectories at deterministic sinks (e.g., high-risk tool calls or unauthorized network egress) and enables autonomous execution correction and architectural rollback when security policies are triggered. Evaluations on OpenClaw and NanoBot demonstrate that Arbiter-K enforces security as a microarchitectural property, intercepting 76% to 95% of unsafe behaviors, a 92.79% absolute gain over native policies. The code is publicly available at https://github.com/cure-lab/ArbiterOS.
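To make the mechanism in the abstract more concrete, the following is a minimal sketch of taint propagation over an instruction dependency graph with interdiction at a high-risk sink. It is not taken from the ArbiterOS code base: the class names (`Instruction`, `Kernel`, `PolicyViolation`), the opcodes, and the single hard-coded policy are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical opcodes; the actual Semantic ISA of Arbiter-K is not shown here.
UNTRUSTED_SOURCES = {"READ_WEB", "READ_EMAIL"}   # outputs start out tainted
HIGH_RISK_SINKS = {"SHELL_EXEC", "NET_SEND"}     # deterministic sinks to guard


class PolicyViolation(Exception):
    """Raised when a tainted instruction reaches a high-risk sink."""


@dataclass
class Instruction:
    node_id: int
    opcode: str
    deps: tuple = ()       # ids of instructions whose output this one consumes
    tainted: bool = False


@dataclass
class Kernel:
    # Assumed stand-in for the dependency graph: node_id -> Instruction.
    graph: dict = field(default_factory=dict)

    def submit(self, opcode, deps=()):
        """Admit one discrete instruction, or reject it before it executes."""
        # Taint is inherited from any dependency, or introduced by a source.
        tainted = opcode in UNTRUSTED_SOURCES or any(
            self.graph[d].tainted for d in deps
        )
        if tainted and opcode in HIGH_RISK_SINKS:
            # The instruction is never added, so the graph stays unchanged.
            raise PolicyViolation(f"{opcode} depends on untrusted data")
        node = Instruction(len(self.graph), opcode, tuple(deps), tainted)
        self.graph[node.node_id] = node
        return node.node_id


kernel = Kernel()
page = kernel.submit("READ_WEB")                   # untrusted source
plan = kernel.submit("LLM_REASON", deps=[page])    # inherits taint
try:
    kernel.submit("SHELL_EXEC", deps=[plan])       # tainted data at a sink
except PolicyViolation as exc:
    print("blocked:", exc)
```

In this sketch the rejected instruction simply never enters the graph, which is a much weaker stand-in for the correction and rollback described above; a fuller design would presumably also restore earlier state and re-prompt the model.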