Kintsugi: Learning Policies by Repairing Executable Knowledge Bases

📅 2026-05-10

📈 Citations: 0

✨ Influential: 0


🤖 AI Summary

Current embodied agents typically encode their policies in black-box form, in neural network weights or prompts, which makes them difficult to inspect, reuse, or compose. This work proposes a white-box policy-learning framework that represents policies as typed, executable knowledge entries. Between rollouts, a tool-constrained agentic editing loop uses rollout trajectories to propose localized policy edits, and a verification gate admits only the edits that pass its checks. At inference, a deterministic symbolic executor runs the accepted knowledge base with no large language model calls. This makes policy evolution interpretable, editable, and verifiable. Evaluated on long-horizon text-agent benchmarks and object-centric manipulation settings, the framework reports strong endpoint performance while preserving policy inspectability, local editability, and verifier-gated deployment.
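To make the idea of a deterministic symbolic executor concrete, here is a minimal sketch, not the paper's implementation. It assumes a STRIPS-style encoding in which a state is a set of true facts and each operator entry carries preconditions, an add set, and a delete set. The names `Operator` and `execute`, and the first-applicable-operator selection rule, are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple

Facts = FrozenSet[str]  # assumption: a state is the set of facts that currently hold


@dataclass(frozen=True)
class Operator:
    """Hypothetical typed KB entry: a symbolic action with explicit effects."""
    name: str
    preconditions: Facts
    add: Facts
    delete: Facts


def execute(kb: List[Operator], state: Facts, goal: Facts,
            max_steps: int = 20) -> Tuple[List[str], bool]:
    """Run the KB deterministically; no model is queried at any step."""
    trace: List[str] = []
    for _ in range(max_steps):
        if goal <= state:
            break
        # Pick the first operator (in KB order) that is applicable and
        # would add at least one fact that does not hold yet.
        op = next((o for o in kb
                   if o.preconditions <= state and not o.add <= state), None)
        if op is None:
            break  # no applicable operator: stop and report failure
        state = (state - op.delete) | op.add
        trace.append(op.name)
    return trace, goal <= state
```

Because operator selection depends only on the KB order and the current state, the same KB and start state always yield the same trace, which is what makes each step attributable to a specific entry.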

📝 Abstract

Modern embodied agents achieve impressive performance, but their task knowledge is often stored in neural weights, latent state, or prompt-bound memory, making individual policy knowledge difficult to inspect, validate, recombine, and reuse. We introduce Kintsugi, a white-box policy-learning framework that treats embodied policy improvement as verifier-gated construction of a typed executable Knowledge Base (KB). Kintsugi represents task-level policy knowledge as composable typed entries (predicates, operators, policy schemas, monitors, recovery rules, experience records, and goals) and improves this artifact through localized typed edits induced from rollout evidence, rather than relying on test-time language-model reasoning. Between rollouts, a tool-constrained agentic editing loop diagnoses trajectory failures, localizes them to editable KB layers, and proposes candidate edits. A deterministic verification gate admits an edit only when the candidate type-checks, the resulting KB executes, and focused validation success or trajectory-health metrics improve without violating protected-regression checks. At inference, the accepted KB is executed by a deterministic symbolic executor with zero LLM calls. Across long-horizon text-agent benchmarks and representative object-centric manipulation settings, Kintsugi achieves strong endpoint performance while preserving inspectability, local editability, and verifier-gated deployment. These results suggest that embodied policy improvement can be organized around executable task knowledge.
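The verification gate described in the abstract can be sketched roughly as follows. This is an illustrative toy under stated assumptions, not the authors' code: the KB is modeled as a plain dictionary, `Edit`, `type_checks`, `apply_edit`, and `gate` are hypothetical names, `run` stands in for a rollout that returns whether a task succeeded, and the trajectory-health alternative mentioned in the abstract is left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

KB = Dict[str, dict]                 # assumption: entry id -> entry record
Runner = Callable[[KB, str], bool]   # assumption: rollout returns task success

# Entry kinds listed in the abstract.
ALLOWED_KINDS = {"predicate", "operator", "policy_schema", "monitor",
                 "recovery_rule", "experience_record", "goal"}


@dataclass(frozen=True)
class Edit:
    """Hypothetical localized edit: replace or add a single KB entry."""
    entry_id: str
    entry: dict


def type_checks(entry: dict) -> bool:
    # Toy check: a known kind plus a string body.
    return entry.get("kind") in ALLOWED_KINDS and isinstance(entry.get("body"), str)


def apply_edit(kb: KB, edit: Edit) -> KB:
    candidate = dict(kb)             # copy so the accepted KB is never mutated
    candidate[edit.entry_id] = edit.entry
    return candidate


def gate(kb: KB, edit: Edit, focused: List[str], protected: List[str],
         run: Runner) -> bool:
    """Admit the edit only if every check in the sketch passes."""
    if not type_checks(edit.entry):
        return False
    candidate = apply_edit(kb, edit)
    focused_before = [run(kb, t) for t in focused]
    protected_before = [run(kb, t) for t in protected]
    try:                             # the candidate KB must execute at all
        focused_after = [run(candidate, t) for t in focused]
        protected_after = [run(candidate, t) for t in protected]
    except Exception:
        return False
    improved = sum(focused_after) > sum(focused_before)
    regressed = any(b and not a
                    for b, a in zip(protected_before, protected_after))
    return improved and not regressed
```

Keeping the gate a pure function of the current KB, the candidate edit, and fixed task lists means the same edit is always accepted or rejected the same way, which matches the deterministic character the abstract attributes to it.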

Problem

Research questions and friction points this paper is trying to address.

embodied agents

policy knowledge

inspectability

reusability

executable knowledge

Innovation

Methods, ideas, or system contributions that make the work stand out.

executable knowledge base

white-box policy learning

verifier-gated editing