MemLineage: Lineage-Guided Enforcement for LLM Agent Memory

📅 2026-05-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the risk that untrusted content persists in the memory of a large language model (LLM) agent and later misleads sensitive operations. The authors introduce MemLineage, which treats memory security as a chain-of-custody problem rather than a filtering problem. MemLineage attaches cryptographic provenance and LLM-mediated derivation lineage to every memory entry, using an RFC-6962 Merkle log over per-principal Ed25519-signed entries and a weighted derivation directed acyclic graph (DAG). A max-of-strong-edges propagation rule makes Untrusted-Path Persistence hold for any chain whose attribution edges stay above threshold, and a sensitive-action gate refuses dispatches whose justification descends from an external ancestor while still allowing benign recall. On a deterministic mechanism-isolation harness, MemLineage is the only one of three defense configurations that drives the attack success rate (ASR) to zero on all three memory-poisoning workloads, with sub-millisecond per-operation overhead.
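As a rough illustration of the Merkle log component, the sketch below computes a Merkle tree hash in the style of RFC 6962 (domain-separated leaf and node hashes, split at the largest power of two below the entry count). It is not the authors' implementation: the function names are hypothetical, the entries are plain byte strings, and the Ed25519 signing of each entry is left out.

```python
import hashlib


def leaf_hash(entry: bytes) -> bytes:
    # RFC 6962 prefixes leaf inputs with 0x00 so a leaf can never
    # be confused with an interior node.
    return hashlib.sha256(b"\x00" + entry).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    # Interior nodes use the 0x01 prefix.
    return hashlib.sha256(b"\x01" + left + right).digest()


def merkle_root(entries: list[bytes]) -> bytes:
    """Merkle tree hash over an ordered list of log entries."""
    n = len(entries)
    if n == 0:
        # The hash of an empty log is the hash of the empty string.
        return hashlib.sha256(b"").digest()
    if n == 1:
        return leaf_hash(entries[0])
    # k is the largest power of two strictly smaller than n.
    k = 1
    while k * 2 < n:
        k *= 2
    return node_hash(merkle_root(entries[:k]), merkle_root(entries[k:]))


# Hypothetical memory entries; in the described design each entry
# would also carry a per-principal signature.
log = [b"user: prefers weekly reports", b"tool: fetched web page"]
root_before = merkle_root(log)
root_after = merkle_root(log + [b"agent: summary of web page"])
```

Because every root commits to all earlier entries, rewriting or removing a stored memory changes the root, which is what makes the log tamper-evident.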

📝 Abstract

We introduce MemLineage, a defense for LLM agent memory that attaches both cryptographic provenance and LLM-mediated derivation lineage to every entry. Recent and concurrent work shows that untrusted content can be written into persistent agent state and re-enter later sessions as an instruction; the remaining systems question is how to preserve useful memory recall while preventing such state from justifying sensitive actions. MemLineage treats this as a chain-of-custody problem rather than a filtering problem. It is a six-module design around an RFC-6962 Merkle log over per-principal Ed25519-signed entries: a weighted derivation DAG records which retrieved entries influenced each new memory, and a max-of-strong-edges propagation rule makes Untrusted-Path Persistence hold for any chain whose attribution edges remain above threshold. The sensitive-action gate then refuses dispatches whose active justification descends from an external ancestor, while still allowing benign recall. We evaluate three defense cells against three memory-poisoning workloads on a deterministic mechanism-isolation harness; MemLineage is the only configuration in that harness that drives all three columns to zero ASR, while sub-millisecond per-operation overhead keeps it well below the noise floor of any LLM call. A Codex-backed AgentDojo bridge further separates strong-model behavior from defense-layer behavior: under an intentionally vulnerable tool-output profile, no-defense and signature-only baselines fail on all six banking pairs, while all MemLineage rows reduce strict AgentDojo ASR to zero. The core deterministic artifacts are byte-equal CI-verified; hosted-model AgentDojo and live-model sweeps are recorded as auditable logs rather than byte-pinned artifacts.
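The propagation rule and the gate described in the abstract can be sketched as follows. This is one possible reading, not the paper's code: an entry is treated as untrusted if it is external itself or if any parent reached through an attribution edge at or above the threshold is untrusted. The class names, the 0.5 threshold, and the example entries are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    entry_id: str
    external: bool  # True if written from an untrusted source
    # parent entry id -> attribution weight in [0, 1] (hypothetical scale)
    parents: dict[str, float] = field(default_factory=dict)


class LineageStore:
    def __init__(self, threshold: float = 0.5):  # assumed threshold
        self.threshold = threshold
        self.entries: dict[str, Entry] = {}
        self.untrusted: dict[str, bool] = {}

    def add(self, entry: Entry) -> None:
        # Parents must already be stored, so the graph stays acyclic.
        strong = [p for p, w in entry.parents.items() if w >= self.threshold]
        inherited = any(self.untrusted[p] for p in strong)
        self.entries[entry.entry_id] = entry
        self.untrusted[entry.entry_id] = entry.external or inherited

    def recall(self, entry_id: str) -> Entry:
        # Recall is never blocked; only sensitive dispatch is gated.
        return self.entries[entry_id]

    def allow_sensitive(self, justification: list[str]) -> bool:
        # Refuse if any justifying entry descends from an external ancestor.
        return not any(self.untrusted[e] for e in justification)


store = LineageStore()
store.add(Entry("web", external=True))
store.add(Entry("pref", external=False))
store.add(Entry("note", external=False, parents={"web": 0.9}))
store.add(Entry("summary", external=False, parents={"note": 0.8, "pref": 0.7}))
store.add(Entry("aside", external=False, parents={"web": 0.1}))
```

In this sketch `summary` stays untrusted two hops away from the external `web` entry, while `aside` does not inherit the label because its only edge is below the threshold. That matches the abstract's wording that persistence holds only for chains whose attribution edges remain above threshold.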

Problem

Research questions and friction points this paper is trying to address.

LLM agent memory

memory poisoning

untrusted content

sensitive actions

persistent state

Innovation

Methods, ideas, or system contributions that make the work stand out.

Merkle log

provenance tracking

memory poisoning defense