Why Trust in AI May Be Inevitable

📅 2025-02-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper challenges the “explanation-driven trust” paradigm, arguing that formal explanations for increasingly complex AI—particularly large language models (LLMs)—may be fundamentally unattainable. Method: It formalizes explanation as constrained path search within a knowledge network and proves that even under ideal conditions, bounded-rational agents cannot always identify an explainable path, forcing reliance on trust as a cognitive substitute. Contribution/Results: First, it formally characterizes the cognitive boundaries of explanation, establishing trust as epistemically necessary—not merely pragmatic—in human-AI interaction. Second, it exposes the risk that LLMs generate superficially plausible but semantically false explanations, exacerbating trust misalignment and undermining knowledge integration. Third, it provides a foundational theoretical constraint for trustworthy AI design: trust is irreducible, and systems must explicitly incorporate mechanisms to detect and mitigate trust substitution—embedding such capabilities directly into architectural design rather than treating trust as an afterthought.
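
The path-search formalization referred to above can be stated compactly; the notation below is a paraphrase of that setup using symbols chosen here, not the paper's own.

```latex
% Sketch of the setup described in the summary; symbols are chosen here for
% illustration and are not taken from the paper. K = (V, E) is the knowledge
% network, S \subseteq V the concepts explainer and explainee already share,
% c \in V the concept to be explained, and T a finite search budget.
\[
  \textit{explainable}(c)
  \;\iff\;
  \exists\, s \in S,\ \exists\, \pi : s \rightsquigarrow c \ \text{in } K
  \quad \text{with } \pi \text{ found at search cost} \le T .
\]
% The claimed boundary: a path s ~> c may exist in K while no bounded-rational
% search is guaranteed to find it within T, so stopping early (and substituting
% trust) can be the rational choice.
```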

📝 Abstract
In human-AI interactions, explanation is widely seen as necessary for enabling trust in AI systems. We argue that trust, however, may be a prerequisite, because explanation is sometimes impossible. We derive this result from a formalization of explanation as a search process through knowledge networks, where explainers must find paths between shared concepts and the concept to be explained, within finite time. Our model reveals that explanation can fail even under theoretically ideal conditions: when actors are rational, honest, motivated, can communicate perfectly, and possess overlapping knowledge. This is because successful explanation requires not just the existence of shared knowledge but also finding the connecting path within time constraints, so it can be rational to cease attempts at explanation before the shared knowledge is discovered. This result has important implications for human-AI interaction: as AI systems, particularly Large Language Models, become more sophisticated and able to generate superficially compelling but spurious explanations, humans may default to trust rather than demand genuine explanations. This creates risks of both misplaced trust and imperfect knowledge integration.
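
A minimal executable sketch of the search process the abstract describes, assuming an invented toy concept graph and a node-expansion budget standing in for the time constraint (both are illustrative choices, not the paper's model): with a small budget the search rationally gives up before reaching the shared concept, even though a connecting path exists.

```python
# Illustrative sketch only: a toy version of the abstract's model, in which an
# explainer searches a knowledge network for a path between the concept to be
# explained and the concepts already shared with the explainee, under a finite
# budget. The graph, budget, and names are invented for illustration.
from collections import deque

def find_explanation_path(graph, shared_concepts, target, budget):
    """Breadth-first search from the target toward any shared concept,
    giving up once `budget` node expansions have been spent."""
    frontier = deque([(target, [target])])
    visited = {target}
    expansions = 0
    while frontier and expansions < budget:
        node, path = frontier.popleft()
        expansions += 1
        if node in shared_concepts:
            return path  # an explainable chain from the target back to common ground
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append((neighbor, path + [neighbor]))
    return None  # budget exhausted before a connecting path was found

# A small concept network in which a connecting path exists but is several hops long.
concept_graph = {
    "transformer": ["attention", "gradient descent"],
    "attention": ["weighted average"],
    "gradient descent": ["calculus"],
    "weighted average": ["arithmetic mean"],
    "calculus": ["limits"],
}
shared = {"arithmetic mean"}  # the only concept explainer and explainee share

print(find_explanation_path(concept_graph, shared, "transformer", budget=3))   # None
print(find_explanation_path(concept_graph, shared, "transformer", budget=10))  # path found
```

Breadth-first search is only one concrete stand-in for the explainer's search procedure; the point carried over from the abstract is that the time budget, not the absence of shared knowledge, is what makes the explanation fail.
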
Problem

Research questions and friction points this paper is trying to address.

Trust in AI may have to precede explanation, because explanation is sometimes impossible.
Explanation can fail even under ideal conditions when the search for a connecting path exceeds the available time.
Humans may default to trusting AI rather than demanding genuine explanations, risking misplaced trust and errors in knowledge integration.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Formalizes explanation as a time-bounded search through knowledge networks
Identifies trust as a prerequisite for, rather than a product of, explanation
Highlights the risks of trusting AI outputs without genuine understanding