Diagnosing and Repairing Citation Failures in Generative Engine Optimization

📅 2026-03-10

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a critical gap in current generative engine optimization (GEO) methods, which overlook the underlying drivers of citation behavior and lack diagnostic capabilities for identifying why documents fail to be cited. To bridge this gap, we propose AgentGEO, a diagnostic optimization framework that introduces the first taxonomy of citation failure modes. By conducting multi-stage diagnostics, AgentGEO pinpoints specific breakdowns in the citation pipeline and leverages agent-driven tool invocation with iterative refinement strategies to selectively enhance citability. Furthermore, we establish a document-centric benchmark to evaluate generalization capabilities. Experiments demonstrate that modifying as little as 5% of the content yields a relative citation gain exceeding 40%, substantially outperforming baseline approaches, while also revealing the potential adverse effects of generic optimization strategies on long-tail content.
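
To make the two headline quantities concrete, here is a minimal sketch of how a relative citation gain and a modified-content fraction could be computed. The function names, the word-level diff, and the numbers in the usage note are illustrative assumptions; the summary does not say how the paper measures either quantity.

```python
from difflib import SequenceMatcher
from typing import Sequence


def citation_rate(cited_flags: Sequence[bool]) -> float:
    """Fraction of queries for which the document was cited."""
    return sum(cited_flags) / len(cited_flags) if cited_flags else 0.0


def relative_gain(before: float, after: float) -> float:
    """Relative improvement of `after` over `before`, e.g. 0.4 means +40%."""
    if before == 0:
        raise ValueError("relative gain is undefined for a zero baseline")
    return (after - before) / before


def modified_fraction(original: str, revised: str) -> float:
    """Share of the original words that were not kept in the revision.

    Assumption: a word-level diff is used as the edit measure; the paper
    may define "content modified" differently.
    """
    a, b = original.split(), revised.split()
    if not a:
        return 0.0
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    kept = sum(block.size for block in matcher.get_matching_blocks())
    return 1 - kept / len(a)
```

As an illustration with made-up numbers, a citation rate that rises from 0.25 to 0.35 is a relative gain of 0.10 / 0.25 = 40%, and replacing one word in a 20-word document gives a modified fraction of 5%.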

📝 Abstract

Generative Engine Optimization (GEO) aims to improve content visibility in AI-generated responses. However, existing methods measure contribution (how much a document influences a response) rather than citation, the mechanism that actually drives traffic back to creators. Moreover, these methods apply generic rewriting rules uniformly and do not diagnose why individual documents are not cited. This paper introduces a diagnostic approach to GEO that asks why a document fails to be cited and intervenes accordingly. We develop a unified framework comprising: (1) the first taxonomy of citation failure modes spanning different stages of a citation pipeline; (2) AgentGEO, an agentic system that diagnoses failures using this taxonomy, selects targeted repairs from a corresponding tool library, and iterates until citation is achieved; and (3) a document-centric benchmark evaluating whether optimizations generalize across held-out queries. AgentGEO achieves over 40% relative improvement in citation rates while modifying only 5% of content, compared to 25% for baselines. Our analysis reveals that generic optimization can harm long-tail content and that some documents face challenges that optimization alone cannot fully address, findings with implications for equitable visibility in AI-mediated information access.
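
The diagnose, repair, and iterate cycle described in item (2) can be sketched roughly as follows. This is not AgentGEO's implementation: the `diagnose_and_repair` function, the `RepairResult` type, the failure-mode label `missing_query_terms`, and the toy diagnoser and tool are all hypothetical placeholders, since the abstract does not list the taxonomy's categories or the contents of the tool library.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical interfaces: a diagnoser returns a failure-mode label,
# or None when the document is already cited for the query.
Diagnoser = Callable[[str, str], Optional[str]]
RepairTool = Callable[[str, str], str]


@dataclass
class RepairResult:
    document: str
    cited: bool
    history: List[str] = field(default_factory=list)  # failure modes repaired


def diagnose_and_repair(
    document: str,
    query: str,
    diagnose: Diagnoser,
    tools: Dict[str, RepairTool],
    max_rounds: int = 3,
) -> RepairResult:
    """Diagnose a failure mode, apply the matching tool, and repeat."""
    history: List[str] = []
    for _ in range(max_rounds):
        mode = diagnose(document, query)
        if mode is None:  # cited: nothing left to repair
            return RepairResult(document, True, history)
        tool = tools.get(mode)
        if tool is None:  # no targeted repair exists for this failure mode
            break
        document = tool(document, query)
        history.append(mode)
    return RepairResult(document, diagnose(document, query) is None, history)


# Toy stand-ins for a real citation check and a real repair tool.
def toy_diagnose(doc: str, query: str) -> Optional[str]:
    return None if query.lower() in doc.lower() else "missing_query_terms"


def toy_add_terms(doc: str, query: str) -> str:
    return f"{doc} This page covers {query}."


result = diagnose_and_repair(
    "A guide to sourdough.",
    "starter hydration",
    toy_diagnose,
    {"missing_query_terms": toy_add_terms},
)
```

Mapping each failure mode to one tool keeps every repair targeted, and the `max_rounds` cap reflects the abstract's point that some documents face challenges optimization alone cannot fully address: the loop stops and reports `cited=False` rather than rewriting indefinitely.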

Problem

Research questions and friction points this paper is trying to address.

Generative Engine Optimization

citation failure

AI-generated responses

content visibility

equitable visibility

Innovation

Methods, ideas, or system contributions that make the work stand out.

Generative Engine Optimization

Citation Failure Diagnosis

AgentGEO