An Evaluation of Context Length Extrapolation in Long Code via Positional Embeddings and Efficient Attention

📅 2026-02-25
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenge that large language models, constrained by fixed context lengths, struggle to effectively process code sequences exceeding their training length. The authors propose a zero-shot inference method that requires no fine-tuning, leveraging optimized positional encoding and efficient attention mechanisms to systematically evaluate the context-length extrapolation capabilities of existing techniques on long code completion tasks. For the first time, they provide a comprehensive comparison of multiple positional embedding schemes and attention strategies in ultra-long code scenarios, revealing significant differences in their extrapolation performance. This study offers empirical evidence and practical guidance for enhancing the ability of large models to handle extremely long code sequences.

📝 Abstract
The rapid advancement of large language models (LLMs) has led to a significant increase in automated tools in software engineering, capable of performing various code-related tasks such as code generation, completion, and translation. Despite these advancements, their effectiveness is constrained by fixed context lengths, limiting their ability to generalize across long, domain-specific code sequences. To address this challenge, we investigate zero-shot, inference-only methods that improve position encodings and optimize attention mechanisms. Our goal is to provide a thorough analysis of current approaches that enable context-length extrapolation in code, particularly for long code completion tasks.
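The abstract does not name the specific positional-encoding variants evaluated; a representative zero-shot technique in this space is RoPE position interpolation, which rescales positions so a long sequence reuses the angle range seen during training. The sketch below (NumPy, illustrative only; `rope_angles` and its parameters are assumptions, not the paper's implementation) shows the idea:

```python
import numpy as np

def rope_angles(positions, dim, base=10000.0, scale=1.0):
    """Rotary position embedding angles for the given positions.

    `scale` > 1 applies linear position interpolation: positions are
    divided by `scale`, squeezing a longer sequence into the position
    range the model was trained on (a common zero-shot extension).
    """
    # Per-dimension inverse frequencies, as in standard RoPE.
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2, dtype=np.float64) / dim))
    pos = np.asarray(positions, dtype=np.float64) / scale
    return np.outer(pos, inv_freq)  # shape: (len(positions), dim // 2)

# With scale = 4, position 8192 maps to the same angles that
# position 2048 produced during training, so a model trained on
# 2048-token contexts can be queried at 8192 tokens without tuning.
train_ctx, long_ctx = 2048, 8192
scaled = rope_angles([long_ctx], dim=64, scale=long_ctx / train_ctx)
original = rope_angles([train_ctx], dim=64)
assert np.allclose(scaled, original)
```

The trade-off, which evaluations like this paper's aim to quantify, is that compressing positions also compresses the distinctions between nearby tokens, which can hurt fine-grained code completion.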
Problem

Research questions and friction points this paper is trying to address.

context length extrapolation
long code
position embeddings
attention mechanisms
code completion
Innovation

Methods, ideas, or system contributions that make the work stand out.

context length extrapolation
positional embeddings
efficient attention
long code completion
zero-shot inference