MemLineage: Lineage-Guided Enforcement for LLM Agent Memory

📅 2026-05-14
🤖 AI Summary
This work addresses the risk that untrusted content persists in the memory of large language model (LLM) agents and later misleads sensitive operations. The authors introduce MemLineage, a framework that applies chain-of-custody principles to memory security. MemLineage annotates each memory entry with cryptographic provenance and LLM-mediated derivation lineage, using an RFC-6962 Merkle log, per-principal Ed25519 signatures, and a weighted derivation directed acyclic graph (DAG). A max-of-strong-edges propagation rule makes Untrusted-Path Persistence hold along any chain whose attribution edges stay above threshold, and a sensitive-action gate refuses dispatches whose justification descends from an external ancestor while still allowing benign recall. On a deterministic mechanism-isolation harness with three memory-poisoning workloads, MemLineage is the only one of three defense configurations that reduces the attack success rate (ASR) to zero on all three, with sub-millisecond per-operation overhead.
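As a rough illustration of the Merkle-log component, the sketch below computes a tree root in the style of RFC 6962, which the abstract names as the log construction. The function names are hypothetical, the byte strings stand in for already-serialized memory entries, and the per-principal Ed25519 signing step is omitted because it needs a third-party library; this is not the authors' code.

```python
import hashlib

def leaf_hash(data: bytes) -> bytes:
    # RFC 6962 leaf hash: SHA-256 over a 0x00 prefix plus the entry bytes.
    return hashlib.sha256(b"\x00" + data).digest()

def node_hash(left: bytes, right: bytes) -> bytes:
    # RFC 6962 interior hash: SHA-256 over a 0x01 prefix plus both children.
    return hashlib.sha256(b"\x01" + left + right).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    n = len(leaves)
    if n == 0:
        # The hash of an empty tree is the hash of the empty string.
        return hashlib.sha256(b"").digest()
    if n == 1:
        return leaf_hash(leaves[0])
    # Split at the largest power of two strictly less than n.
    k = 1
    while k * 2 < n:
        k *= 2
    return node_hash(merkle_root(leaves[:k]), merkle_root(leaves[k:]))

# Hypothetical serialized memory entries (assumed format, for illustration only).
entries = [b"entry-0", b"entry-1", b"entry-2"]
root = merkle_root(entries)
```

Because every entry feeds into the root, rewriting any stored entry after the fact changes the root, which is what makes the log tamper-evident.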
📝 Abstract
We introduce MemLineage, a defense for LLM agent memory that attaches both cryptographic provenance and LLM-mediated derivation lineage to every entry. Recent and concurrent work shows that untrusted content can be written into persistent agent state and re-enter later sessions as an instruction; the remaining systems question is how to preserve useful memory recall while preventing such state from justifying sensitive actions. MemLineage treats this as a chain-of-custody problem rather than a filtering problem. It is a six-module design around an RFC-6962 Merkle log over per-principal Ed25519-signed entries: a weighted derivation DAG records which retrieved entries influenced each new memory, and a max-of-strong-edges propagation rule makes Untrusted-Path Persistence hold for any chain whose attribution edges remain above threshold. The sensitive-action gate then refuses dispatches whose active justification descends from an external ancestor, while still allowing benign recall. We evaluate three defense cells against three memory-poisoning workloads on a deterministic mechanism-isolation harness; MemLineage is the only configuration in that harness that drives all three columns to zero ASR, while sub-millisecond per-operation overhead keeps it well below the noise floor of any LLM call. A Codex-backed AgentDojo bridge further separates strong-model behavior from defense-layer behavior: under an intentionally vulnerable tool-output profile, no-defense and signature-only baselines fail on all six banking pairs, while all MemLineage rows reduce strict AgentDojo ASR to zero. The core deterministic artifacts are byte-equal CI-verified; hosted-model AgentDojo and live-model sweeps are recorded as auditable logs rather than byte-pinned artifacts.
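The propagation rule and the gate can be pictured with a small sketch. It assumes one plausible reading of the abstract: an entry counts as untrusted if it is external itself or if any parent reached through an edge at or above threshold is untrusted. The `Entry` type, the `THRESHOLD` value, and the function names are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field

THRESHOLD = 0.5  # hypothetical attribution threshold; the paper's value is not stated here

@dataclass
class Entry:
    entry_id: str
    external: bool  # True if the entry was written from untrusted external content
    # Maps parent entry id to attribution weight in the derivation DAG.
    parents: dict[str, float] = field(default_factory=dict)

def is_tainted(entry_id, store, threshold=THRESHOLD, _memo=None):
    """True if the entry is external or has a strong-edge path to an external ancestor."""
    memo = {} if _memo is None else _memo
    if entry_id in memo:
        return memo[entry_id]
    entry = store[entry_id]
    # Only edges at or above threshold carry taint; weak edges are ignored.
    tainted = entry.external or any(
        weight >= threshold and is_tainted(parent, store, threshold, memo)
        for parent, weight in entry.parents.items()
    )
    memo[entry_id] = tainted
    return tainted

def gate(action_sensitive, justification_ids, store, threshold=THRESHOLD):
    """Return True if the dispatch is allowed."""
    if not action_sensitive:
        return True  # benign recall and non-sensitive actions pass through
    return not any(is_tainted(i, store, threshold) for i in justification_ids)

store = {
    "web": Entry("web", external=True),
    "pref": Entry("pref", external=False),
    "note": Entry("note", external=False, parents={"web": 0.9}),
    "summary": Entry("summary", external=False, parents={"note": 0.7}),
    "weak": Entry("weak", external=False, parents={"web": 0.2, "pref": 0.8}),
}
```

In this toy store, `summary` inherits taint from `web` through two strong edges, so a sensitive action justified by it would be refused, while `weak` only touches `web` through a below-threshold edge and stays usable.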
Problem

Research questions and friction points this paper is trying to address.

LLM agent memory
memory poisoning
untrusted content
sensitive actions
persistent state
Innovation

Methods, ideas, or system contributions that make the work stand out.

Merkle log
provenance tracking
memory poisoning defense
LLM agent security
derivation lineage