Nearly Optimal Attention Coresets

📅 2026-05-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the problem of approximating the output of the attention mechanism in small space, with a uniform accuracy guarantee for every query of norm at most ρ. Using tools from high-dimensional geometry and probabilistic analysis, the authors show that sparse subsets of unit-norm key-value pairs, termed ε-coresets, exist that satisfy this uniform error bound. The main contribution is an existence proof for ε-coresets of size O(√d·e^{ρ+o(ρ)}/ε) together with a lower bound of Ω(√d·e^ρ/ε). This improves on the best previously known results and characterizes the optimal coreset size up to an e^{o(ρ)} factor.
📝 Abstract
We consider the problem of estimating the Attention mechanism in small space, and prove the existence of coresets for it of nearly optimal size. Specifically, we show that for any set of unit-norm keys and values $(K,V)$ in $\mathbb{R}^d$, there exists a subset $(K',V')$ of size at most $O(\sqrt{d}\, e^{\rho+o(\rho)}/\varepsilon)$ such that \[ \left\| \operatorname{Attn}(q,K,V)- \operatorname{Attn}(q,K',V') \right\| \le \varepsilon \] simultaneously for all queries whose norm is bounded by $\rho$. This outperforms the best known results for this problem. We also offer an improved lower bound showing that $\varepsilon$-coresets must have size $\Omega(\sqrt{d}\, e^{\rho}/\varepsilon)$.
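To make the guarantee in the abstract concrete, the following minimal sketch evaluates the error $\|\operatorname{Attn}(q,K,V)-\operatorname{Attn}(q,K',V')\|$ for a candidate subset. It rests on several assumptions that are not stated in the abstract: $\operatorname{Attn}$ is taken to be standard softmax attention, $\sum_i \operatorname{softmax}(\langle q,k_i\rangle)_i\, v_i$, with no $1/\sqrt{d}$ scaling; the subset is chosen uniformly at random as a placeholder and is not the paper's construction; and the function names are hypothetical. Because it only samples queries, the value it reports is a lower bound on the true worst-case error over the ball of radius $\rho$, so it can refute a candidate coreset but cannot certify one.

```python
import math
import random


def attn(q, keys, values):
    """Softmax attention for one query (assumed definition, no 1/sqrt(d) scaling)."""
    scores = [sum(a * b for a, b in zip(q, k)) for k in keys]
    m = max(scores)  # subtract the max score for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) / total
            for j in range(dim)]


def random_unit_vector(d, rng):
    """Draw a unit-norm vector in R^d by normalizing a Gaussian sample."""
    x = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(c * c for c in x))
    return [c / norm for c in x]


def max_sampled_error(keys, values, subset, rho, d, trials, rng):
    """Largest attention error seen over `trials` random queries with norm <= rho.

    This only lower-bounds the worst case over all such queries.
    """
    sub_keys = [keys[i] for i in subset]
    sub_values = [values[i] for i in subset]
    worst = 0.0
    for _ in range(trials):
        direction = random_unit_vector(d, rng)
        radius = rho * rng.random()  # query norm in [0, rho)
        q = [radius * c for c in direction]
        full = attn(q, keys, values)
        approx = attn(q, sub_keys, sub_values)
        worst = max(worst, math.dist(full, approx))
    return worst


# Illustrative run with made-up sizes: n unit-norm pairs, a random subset of size m.
rng = random.Random(0)
d, n, m, rho = 8, 200, 50, 2.0
keys = [random_unit_vector(d, rng) for _ in range(n)]
values = [random_unit_vector(d, rng) for _ in range(n)]
subset = rng.sample(range(n), m)  # placeholder choice, not the paper's coreset
print(max_sampled_error(keys, values, subset, rho, d, trials=200, rng=rng))
```

Since both attention outputs are convex combinations of unit-norm values, the reported error never exceeds 2. Comparing it against a target $\varepsilon$ for growing subset sizes gives a rough feel for how the error depends on $d$ and $\rho$ in this toy setting.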
Problem

Research questions and friction points this paper is trying to address.

Attention mechanism
coresets
space efficiency
approximation error
query-bounded norm
Innovation

Methods, ideas, or system contributions that make the work stand out.

Attention mechanism
coresets
space-efficient approximation
theoretical bounds
query-bounded attention