🤖 AI Summary
LLMs suffer from catastrophic forgetting during continual knowledge updating and, unlike humans, struggle to identify and resolve conflicts between new and existing knowledge. This paper is the first to bring cognitive dissonance theory into LLM knowledge updating, proposing a conflict-aware update paradigm grounded in human cognitive mechanisms. Specifically, it detects dissonant information via activation and gradient features, tracks neuron activation dynamics to distinguish frequently used ("stubborn") from rarely used ("plastic") parameters, and applies targeted parameter updates accordingly. Experiments show that dissonance can be detected efficiently with minimal overhead; non-dissonant updates preserve prior knowledge almost perfectly, whereas dissonant updates cause global degradation of even unrelated knowledge, revealing a fundamental structural fragility in current LLM knowledge representations. This work establishes a novel paradigm and empirical foundation for building robust, evolvable language model knowledge architectures.
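The detection step described above can be sketched as a simple rule-based classifier over two cheap signals. This is a hypothetical illustration, not the paper's implementation: the feature names (`mean_activation`, `grad_norm`) and thresholds are assumptions. The intuition is that familiar facts yield small loss gradients, dissonant facts yield large gradients despite strong (familiar-looking) activations, and novel facts yield weak activations with large gradients.

```python
def classify_information(mean_activation: float, grad_norm: float,
                         act_thresh: float = 0.5, grad_thresh: float = 1.0) -> str:
    """Toy novel/familiar/dissonant classifier from activation and gradient
    features. Thresholds are illustrative assumptions, not the paper's values."""
    if grad_norm < grad_thresh:
        return "familiar"   # model already encodes this; little to learn
    if mean_activation > act_thresh:
        return "dissonant"  # strongly activated yet high gradient: conflict
    return "novel"          # weakly activated and high gradient: new information

print(classify_information(0.8, 0.2))  # familiar
print(classify_information(0.8, 2.0))  # dissonant
print(classify_information(0.1, 2.0))  # novel
```

In practice the paper learns this distinction from model-internal features rather than fixed thresholds, but the two-signal structure is the same.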
📝 Abstract
Despite remarkable capabilities, large language models (LLMs) struggle to continually update their knowledge without catastrophic forgetting. In contrast, humans effortlessly integrate new information, detect conflicts with existing beliefs, and selectively update their mental models. This paper introduces a cognitive-inspired investigation paradigm to study continual knowledge updating in LLMs. We implement two key components inspired by human cognition: (1) Dissonance and Familiarity Awareness, analyzing model behavior to classify information as novel, familiar, or dissonant; and (2) Targeted Network Updates, which track neural activity to identify frequently used (stubborn) and rarely used (plastic) neurons. Through carefully designed experiments in controlled settings, we uncover a number of empirical findings demonstrating the potential of this approach. First, dissonance detection is feasible using simple activation and gradient features, suggesting potential for cognitive-inspired training. Second, we find that non-dissonant updates largely preserve prior knowledge regardless of targeting strategy, revealing inherent robustness in LLM knowledge integration. Most critically, we discover that dissonant updates prove catastrophically destructive to the model's knowledge base, indiscriminately affecting even information unrelated to the current updates. This suggests fundamental limitations in how neural networks handle contradictions and motivates the need for new approaches to knowledge updating that better mirror human cognitive mechanisms.
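The Targeted Network Updates component described in the abstract could be sketched as follows: track how often each neuron fires on a probe set, then partition neurons into "stubborn" (frequently used, to be protected) and "plastic" (rarely used, safe to update) groups. This is a minimal sketch under assumed choices (ReLU layer, firing threshold of 0, median split), not the paper's implementation.

```python
import numpy as np

def activation_frequency(activations: np.ndarray, threshold: float = 0.0) -> np.ndarray:
    """Fraction of probe inputs on which each neuron's post-nonlinearity
    activation exceeds threshold. activations: (num_inputs, num_neurons)."""
    return (activations > threshold).mean(axis=0)

def split_neurons(freq: np.ndarray, plastic_quantile: float = 0.5):
    """Neurons below the quantile cutoff are 'plastic'; the rest 'stubborn'.
    The median split is an illustrative assumption."""
    cutoff = np.quantile(freq, plastic_quantile)
    plastic = np.flatnonzero(freq < cutoff)
    stubborn = np.flatnonzero(freq >= cutoff)
    return plastic, stubborn

# Toy probe: 256 inputs through a 32-neuron ReLU layer with random weights.
rng = np.random.default_rng(0)
x = rng.normal(size=(256, 16))
w = rng.normal(size=(16, 32))
acts = np.maximum(x @ w, 0.0)  # ReLU activations

freq = activation_frequency(acts)
plastic, stubborn = split_neurons(freq)
print(len(plastic), len(stubborn))  # the two groups together cover all 32 neurons
```

A targeted update would then restrict gradient steps to the `plastic` indices, which is how the abstract's non-dissonant updates can leave frequently used ("stubborn") knowledge largely intact.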