Horizon-Length Prediction: Advancing Fill-in-the-Middle Capabilities for Code Generation with Lookahead Planning

📅 2024-10-04
🏛️ arXiv.org
📈 Citations: 5
Influential: 0
🤖 AI Summary
Existing code language models rely on standard next-token prediction (NTP) for Fill-in-the-Middle (FIM) tasks, which limits their ability to plan middle content against the distant right context and often yields infills that fail to connect smoothly with the surrounding code. This work introduces Horizon-Length Prediction (HLP), a training objective that formulates the number of remaining middle tokens as an explicit planning target at each generation step. This enables models to learn infilling boundaries on their own, without rule-based post-processing or restrictive dataset-specific assumptions. The method operates within a standard token-level autoregressive framework, jointly optimizing NTP and horizon-length prediction in a multi-task setup, and is compatible with mainstream architectures. Evaluated on file-level and repository-level FIM benchmarks, the approach achieves up to a 24% relative improvement and also boosts performance on code reasoning. Training overhead is negligible, and there is no additional inference cost.

📝 Abstract
Fill-in-the-Middle (FIM) has become integral to code language models, enabling generation of missing code given both left and right contexts. However, the current FIM training paradigm, which reorders original training sequences and then performs regular next-token prediction (NTP), often leads to models struggling to generate content that aligns smoothly with the surrounding context. Crucially, while existing works rely on rule-based post-processing to circumvent this weakness, such methods are not practically usable in open-domain code completion tasks as they depend on restrictive, dataset-specific assumptions (e.g., generating the same number of lines as in the ground truth). Moreover, model performance on FIM tasks deteriorates significantly without these unrealistic assumptions. We hypothesize that NTP alone is insufficient for models to learn effective planning conditioned on the distant right context, a critical factor for successful code infilling. To overcome this, we propose Horizon-Length Prediction (HLP), a novel training objective that teaches models to predict the number of remaining middle tokens (i.e., horizon length) at each step. HLP advances FIM with lookahead planning, enabling models to inherently learn infilling boundaries for arbitrary left and right contexts without relying on dataset-specific post-processing. Our evaluation across different models and sizes shows that HLP significantly improves FIM performance by up to 24% relatively on diverse benchmarks, across file-level and repository-level, and without resorting to unrealistic post-processing methods. Furthermore, the enhanced planning capability gained through HLP boosts model performance on code reasoning. Importantly, HLP only incurs negligible training overhead and no additional inference cost, ensuring its practicality for real-world scenarios.
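The abstract describes HLP as a multi-task objective: alongside standard next-token prediction, the model predicts at each step how many middle tokens remain. A minimal sketch of how such a joint loss could look is below, assuming a simple linear regression head over hidden states and a normalized remaining-length target with an L2 penalty; the head design, target normalization, and loss weight `alpha` are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class HLPHead(nn.Module):
    """Hypothetical auxiliary head: at each position, predict the
    fraction of middle tokens still remaining (in [0, 1])."""
    def __init__(self, hidden_size: int):
        super().__init__()
        self.proj = nn.Linear(hidden_size, 1)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden_size) -> (batch, seq_len)
        return torch.sigmoid(self.proj(hidden_states)).squeeze(-1)

def joint_loss(ntp_loss: torch.Tensor,
               hlp_pred: torch.Tensor,
               middle_mask: torch.Tensor,
               alpha: float = 0.1) -> torch.Tensor:
    """Multi-task objective: NTP loss plus an HLP regression term
    applied only to middle-span positions (alpha is an assumed weight)."""
    # Total middle length per sequence: (batch, 1)
    lengths = middle_mask.sum(dim=1, keepdim=True).clamp(min=1)
    # Middle tokens consumed so far at each position: (batch, seq_len)
    seen = torch.cumsum(middle_mask, dim=1)
    # Normalized remaining-length target, e.g. [1, 2/3, 1/3, 0] for a
    # 3-token middle span.
    target = (lengths - seen).clamp(min=0) / lengths
    # Masked mean squared error over middle positions only.
    hlp_loss = (((hlp_pred - target) ** 2) * middle_mask).sum() \
        / middle_mask.sum().clamp(min=1)
    return ntp_loss + alpha * hlp_loss
```

Because the extra head is a single linear projection trained jointly with NTP, training overhead stays small, and the head can simply be dropped at inference time, which is consistent with the abstract's claim of no additional inference cost.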
Problem

Research questions and friction points this paper is trying to address.

Improve code infilling alignment with surrounding context
Enhance planning for distant right context in FIM
Predict infilling boundaries without post-processing
Innovation

Methods, ideas, or system contributions that make the work stand out.

Horizon-Length Prediction for code infilling
Lookahead planning with HLP training objective
No extra inference cost with improved performance
👥 Authors
Yifeng Ding (University of Illinois at Urbana-Champaign)
Hantian Ding (AWS AI Labs)
Shiqi Wang (AWS AI Labs)
Qing Sun (AWS AI Labs)
Varun Kumar (AWS AI Labs)
Zijian Wang (AWS AI Labs)