What Counts as AI Sycophancy? A Taxonomy and Expert Survey of a Fragmented Construct

📅 2026-05-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the lack of a consistent definition of AI sycophancy, which makes evaluation results hard to compare and mitigation strategies hard to transfer. Drawing on a review of 70 papers and a survey of 106 experts, the authors propose a taxonomy organized along two dimensions: the target of sycophancy (the user's positions and beliefs versus the user's broader personal traits and emotions) and its mode of expression (explicit, direct language versus implicit behaviors such as framing, omission, or tone). Mapping the literature onto this taxonomy shows that existing research concentrates on overt sycophancy toward users' beliefs and leaves subtle, person-directed behaviors relatively understudied. Although 94.3% of surveyed experts agree that sycophancy is a significant problem in current AI systems, they disagree substantially on which specific behaviors qualify. The taxonomy offers a shared vocabulary for defining, evaluating, and governing AI sycophancy.

📝 Abstract

AI sycophancy has become a prominent concern in large language model (LLM) research. Yet the term lacks a consistent definition and has been applied to behaviors ranging from agreeing with a user's false claim to excessively praising the user to withholding corrective feedback. When researchers, companies, and policymakers use the same term to describe different behaviors, evaluation results become difficult to compare, mitigation strategies fail to transfer, and systems that are resistant to one form of sycophancy continue exhibiting other forms. To address this, we make two contributions. First, we reviewed 70 papers on AI sycophancy to develop a taxonomy of how the behavior has been defined and measured. The taxonomy distinguishes (1) whether a model is sycophantic toward a user's positions and beliefs, or toward the user's broader personal traits and emotions, and (2) whether this occurs through explicit, direct language or more implicit, subtle behaviors such as framing, omission, or tone. Mapping existing literature to our taxonomy reveals that current research has focused on overt forms of sycophancy toward users' beliefs, leaving more subtle and person-directed behaviors relatively understudied. Second, we surveyed 106 experts in AI sycophancy and related fields to examine whether researchers agree on which model behaviors are sycophantic. While experts are nearly unanimous in believing that sycophancy is a significant problem in current AI systems (94.3% agree), they disagree substantially on which specific behaviors qualify. Together, these findings demonstrate that AI sycophancy is a broad family of behaviors with different measurement challenges, intervention requirements, and governance implications. Our taxonomy provides a shared vocabulary for understanding and addressing these behaviors.
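The two dimensions described in the abstract can be read as a 2×2 grid. The following is a minimal sketch of that grid as a data structure, not the authors' own coding scheme: the names `Target`, `Mode`, `SycophancyLabel`, and `all_cells` are hypothetical, and the placement of the example behaviors into cells is an illustrative assumption that the paper's experts might well dispute.

```python
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    """What the model is sycophantic toward (dimension 1 in the abstract)."""
    BELIEFS = "the user's positions and beliefs"
    PERSON = "the user's broader personal traits and emotions"


class Mode(Enum):
    """How the sycophancy is expressed (dimension 2 in the abstract)."""
    EXPLICIT = "explicit, direct language"
    IMPLICIT = "implicit behaviors such as framing, omission, or tone"


@dataclass(frozen=True)
class SycophancyLabel:
    """One cell of the 2x2 grid (hypothetical helper type)."""
    target: Target
    mode: Mode

    def cell(self) -> str:
        # Short, human-readable name for the cell, e.g. "explicit / beliefs".
        return f"{self.mode.name.lower()} / {self.target.name.lower()}"


def all_cells() -> list[SycophancyLabel]:
    """Enumerate every combination of target and mode."""
    return [SycophancyLabel(t, m) for t in Target for m in Mode]


# Assumed, illustrative placements of behaviors named in the abstract.
# These assignments are NOT taken from the paper.
examples = {
    "agreeing with a user's false claim":
        SycophancyLabel(Target.BELIEFS, Mode.EXPLICIT),
    "excessively praising the user":
        SycophancyLabel(Target.PERSON, Mode.EXPLICIT),
    "withholding corrective feedback":
        SycophancyLabel(Target.BELIEFS, Mode.IMPLICIT),
}

if __name__ == "__main__":
    for behavior, label in examples.items():
        print(f"{behavior}: {label.cell()}")
```

Treating the two dimensions as independent enums keeps all four cells addressable, including the implicit, person-directed cell that the abstract describes as relatively understudied.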

Problem

Research questions and friction points this paper is trying to address.

AI sycophancy

large language models

behavioral definition

taxonomy

expert disagreement

Innovation

Methods, ideas, or system contributions that make the work stand out.

AI sycophancy

taxonomy

large language models