🤖 AI Summary
This work investigates how large language models (LLMs) perceive and correct spelling errors in inputs, focusing on identifying typo-specific neurons and attention heads. Methodologically, it employs activation difference analysis, causal mediation probing, attention visualization, inter-layer intervention, and token-level attribution. The study provides the first systematic empirical validation of “typo neurons” and “typo attention heads”: mid-layer neurons drive global correction, while input- and output-layer neurons independently perform local correction; typo heads attend broadly to contextual windows rather than specific tokens, and both components remain consistently active during standard semantic tasks—demonstrating dual functionality in error correction and general language understanding. Experiments further show that typo correction can be achieved without full-layer activation, revealing an internal, robust, and modular error-correction mechanism within LLMs.
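The activation-difference analysis mentioned above can be sketched as follows. This is a minimal illustrative toy, not the paper's actual implementation: the function name, the mean-absolute-difference criterion, and the simulated activations are all assumptions, but they convey the core idea of flagging neurons whose activations shift most between clean and typo-containing inputs.

```python
import random

def find_typo_neurons(acts_clean, acts_typo, top_k=3):
    """Rank neuron indices by mean absolute activation difference
    between clean and typo inputs (hypothetical selection criterion)."""
    n_inputs, n_neurons = len(acts_clean), len(acts_clean[0])
    diffs = [
        sum(abs(acts_typo[i][j] - acts_clean[i][j]) for i in range(n_inputs)) / n_inputs
        for j in range(n_neurons)
    ]
    # Neurons with the largest clean-vs-typo shift are candidate "typo neurons".
    return sorted(range(n_neurons), key=lambda j: diffs[j], reverse=True)[:top_k]

# Toy demo: 50 inputs, 16 neurons; neuron 7 is made artificially typo-sensitive.
random.seed(0)
clean = [[random.gauss(0, 1) for _ in range(16)] for _ in range(50)]
typo = [[v + (2.0 if j == 7 else 0.0) for j, v in enumerate(row)] for row in clean]
print(find_typo_neurons(clean, typo))  # neuron 7 ranks first
```

In the real setting, the activation matrices would come from hidden states of an actual LLM run on paired clean/typo prompts, and the ranking would be computed per layer to localize typo neurons to early, middle, or late layers.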
📝 Abstract
This paper investigates how LLMs encode inputs with typos. We hypothesize that specific neurons and attention heads recognize typos and fix them internally using local and global contexts. We introduce a method to identify typo neurons and typo heads that are especially active when inputs contain typos. Our experimental results suggest the following: 1) LLMs can fix typos with local contexts when the typo neurons in either the early or late layers are activated, even if those in the other are not. 2) Typo neurons in the middle layers perform the core of typo-fixing with global contexts. 3) Typo heads fix typos by attending broadly to the context rather than focusing on specific tokens. 4) Typo neurons and typo heads serve not only typo-fixing but also the understanding of general contexts.