The Trap of Trajectory: Towards Understanding and Mitigating Spurious Correlations in Agentic Memory

📅 2026-05-10

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the susceptibility of large language model agents to spurious correlations introduced through their memory mechanisms, which can propagate erroneous reasoning into downstream decisions. The study diagnoses the issue systematically, constructing a trajectory-level benchmark of spurious patterns identified through causal structure. To mitigate such false dependencies, the authors propose CAMEL, a lightweight, plug-and-play calibration method that operates at both memory write time and retrieval time. CAMEL is compatible with diverse memory architectures and consistently reduces reliance on spurious correlations across all three types of spurious patterns studied, while preserving or improving performance on clean inputs. It also remains robust under adaptive attacks that target the calibration itself.

📝 Abstract

Agentic memory enables LLMs to persist information beyond a single context window and reuse it in later decisions, but it also introduces a new vulnerability: spurious correlations, where retrieved memory carries miscorrelated evidence and propagates erroneous reasoning into downstream decisions. Despite the widespread use of agentic memory, this risk remains largely underexplored. We address it from two aspects. First, we benchmark several canonical types of spurious patterns identified through causal structure and record them across trajectory-level memory. Diagnosing agentic memory systems on this benchmark reveals that memory improves reasoning on clean inputs but amplifies reliance on spurious patterns when they are present. Second, we propose CAMEL, a plug-and-play calibration method that operates across diverse memory architectures at both write and retrieval time. CAMEL consistently reduces reliance on spurious patterns across all three types while preserving or improving performance on clean inputs and staying robust under adaptive attacks targeting the calibration. Overall, CAMEL offers a principled and lightweight solution toward more reliable agentic memory deployment.
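To make "plug-and-play calibration at both write and retrieval time" concrete, here is a minimal toy sketch of a memory wrapper with one hook at each stage. It is not CAMEL's algorithm, which the abstract does not specify. The class names `MemoryEntry` and `CalibratedMemory`, the `is_suspect` detector, the 0.1 down-weighting factor, and the word-overlap ranking are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class MemoryEntry:
    text: str
    weight: float = 1.0  # retrieval weight; lowered for flagged entries


@dataclass
class CalibratedMemory:
    """Hypothetical wrapper, NOT the paper's method: one hook per stage."""
    is_suspect: Callable[[str], bool]  # assumed detector for spurious cues
    entries: List[MemoryEntry] = field(default_factory=list)

    def write(self, text: str) -> None:
        # Write-time hook: keep the entry, but down-weight it if flagged.
        weight = 0.1 if self.is_suspect(text) else 1.0
        self.entries.append(MemoryEntry(text, weight))

    def retrieve(self, query: str, k: int = 1) -> List[str]:
        # Retrieval-time hook: rank by word overlap scaled by stored weight.
        query_words = set(query.lower().split())

        def rank(entry: MemoryEntry) -> float:
            overlap = len(query_words & set(entry.text.lower().split()))
            return overlap * entry.weight

        ranked = sorted(self.entries, key=rank, reverse=True)
        return [entry.text for entry in ranked[:k]]


# Usage: a keyword stub stands in for a real spurious-pattern detector.
memory = CalibratedMemory(is_suspect=lambda t: "always" in t.lower())
memory.write("tool A always succeeds on Mondays")
memory.write("tool A failed due to missing auth token")
memory.write("tool B needs auth token")
top = memory.retrieve("tool A auth token", k=1)
```

Because the wrapper only intercepts `write` and `retrieve`, the same pattern could in principle sit in front of different memory backends, which is the sense of "plug-and-play" assumed here.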

Problem

Research questions and friction points this paper is trying to address.

spurious correlations

agentic memory

trajectory

memory reliability

erroneous reasoning

Innovation

Methods, ideas, or system contributions that make the work stand out.

spurious correlations

agentic memory

CAMEL