From Completion to Editing: Unlocking Context-Aware Code Infilling via Search-and-Replace Instruction Tuning

📅 2026-01-19

🤖 AI Summary

Existing code completion methods such as Fill-in-the-Middle (FIM) cannot correct errors in the surrounding context and rely on unaligned, potentially unsafe base models. Chat-based large language models are safer but lose completion performance, and agent-based workflows incur high latency. To address these limitations, this work proposes the Search-and-Replace Infilling (SRI) framework, which extends code completion from static infilling to dynamic, context-aware editing. SRI internalizes the agent-like verify-and-edit mechanism into a single inference pass, which keeps latency low and preserves both general programming proficiency and instruction-following capabilities. The authors synthesize the SRI-200K dataset of structured search-and-replace instructions and fine-tune the SRI-Coder series on it. With only 20,000 training samples, SRI-Coder lets chat models surpass the completion performance of their base counterparts, with inference latency comparable to standard FIM.

📝 Abstract

The dominant Fill-in-the-Middle (FIM) paradigm for code completion is constrained by its inability to correct contextual errors and its reliance on unaligned, insecure Base models. While Chat LLMs offer safety and Agentic workflows provide flexibility, they suffer from performance degradation and prohibitive latency, respectively. To resolve this dilemma, we propose Search-and-Replace Infilling (SRI), a framework that internalizes the agentic verification-and-editing mechanism into a unified, single-pass inference process. By structurally grounding edits via an explicit search phase, SRI harmonizes completion tasks with the instruction-following priors of Chat LLMs, extending the paradigm from static infilling to dynamic context-aware editing. We synthesize a high-quality dataset, SRI-200K, and fine-tune the SRI-Coder series. Extensive evaluations demonstrate that with minimal data (20k samples), SRI-Coder enables Chat models to surpass the completion performance of their Base counterparts. Crucially, unlike FIM-style tuning, SRI preserves general coding competencies and maintains inference latency comparable to standard FIM. We empower the entire Qwen3-Coder series with SRI, encouraging the developer community to leverage this framework for advanced auto-completion and assisted development.
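
To make the contrast with static infilling concrete, here is a minimal sketch of what applying a search-and-replace edit could look like. It is not the paper's implementation or prompt format: the `<CURSOR>` placeholder, the `apply_search_replace` helper, and the exactly-one-match rule are all assumptions chosen for illustration. The point is that a single edit can both fill the hole and fix an error in the surrounding context, which a pure FIM completion cannot do because it only emits the missing middle.

```python
def apply_search_replace(source: str, search: str, replace: str) -> str:
    """Replace the single occurrence of `search` in `source` with `replace`.

    Hypothetical helper: requiring exactly one match is an assumption made
    here so that the edit is unambiguously grounded in the existing code.
    """
    count = source.count(search)
    if count != 1:
        raise ValueError(f"search block must match exactly once, matched {count} times")
    return source.replace(search, replace, 1)


# Code with a hole at <CURSOR> (a hypothetical marker) and a contextual
# error before it: a running product must start at 1, not 0.
source = (
    "def product(xs):\n"
    "    total = 0\n"
    "    for x in xs:\n"
    "        <CURSOR>\n"
    "    return total\n"
)

# The search block quotes existing lines around the hole; the replace block
# fills the hole and also corrects the faulty initial value.
search = "    total = 0\n    for x in xs:\n        <CURSOR>\n"
replace = "    total = 1\n    for x in xs:\n        total *= x\n"

edited = apply_search_replace(source, search, replace)
print(edited)
```

In this sketch the search text doubles as a verification step: if the quoted context does not appear exactly once in the file, the edit is rejected instead of being applied at the wrong location.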

Problem

Research questions and friction points this paper is trying to address.

code completion

context-aware editing

Fill-in-the-Middle

instruction tuning

code infilling

Innovation

Methods, ideas, or system contributions that make the work stand out.

Search-and-Replace Infilling

Context-Aware Code Editing

Instruction Tuning