Challenge on Optimization of Context Collection for Code Completion

📅 2025-10-05
🤖 AI Summary
This study addresses the challenge of improving fill-in-the-middle (FIM) code completion quality for Python and Kotlin by optimizing project-level contextual retrieval and filtering from source code repositories. Methodologically, it integrates neural language models, code retrieval techniques, and a lightweight context relevance filtering algorithm to enhance both the coverage and the semantic relevance of input contexts. To enable rigorous evaluation, the organizers construct a large-scale benchmark dataset derived from real-world, permissively licensed open-source projects and introduce a multi-model competition framework using chrF as a unified evaluation metric for the systematic comparison of context selection strategies. During the public phase, nineteen teams competed in the Python track and eight in the Kotlin track; six teams entered the private phase, five of which submitted full papers. This work establishes a reproducible, quantitatively evaluable paradigm for context optimization in AI-assisted software engineering.
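A fill-in-the-middle prompt rearranges the code around the cursor so the model generates the missing span last, with retrieved repository context prepended. The sketch below illustrates this setup; the sentinel token names and the `# file:` header convention are assumptions for illustration, as actual FIM tokens vary by model:

```python
def build_fim_prompt(context_files, prefix, suffix,
                     fim_prefix="<|fim_prefix|>",
                     fim_suffix="<|fim_suffix|>",
                     fim_middle="<|fim_middle|>"):
    """Assemble a FIM prompt: retrieved repository context comes first,
    then prefix and suffix around the hole; the model continues after
    the middle sentinel, producing the missing code."""
    context = "".join(
        f"# file: {path}\n{body}\n" for path, body in context_files
    )
    return f"{context}{fim_prefix}{prefix}{fim_suffix}{suffix}{fim_middle}"
```

The completion quality then depends heavily on which files end up in `context_files`, which is exactly what the challenge asks participants to optimize.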

📝 Abstract
The rapid advancement of workflows and methods for software engineering using AI emphasizes the need for a systematic evaluation and analysis of their ability to leverage information from entire projects, particularly in large code bases. In this challenge on optimization of context collection for code completion, organized by JetBrains in collaboration with Mistral AI as part of the ASE 2025 conference, participants developed efficient mechanisms for collecting context from source code repositories to improve fill-in-the-middle code completions for Python and Kotlin. We constructed a large dataset of real-world code in these two programming languages using permissively licensed open-source projects. The submissions were evaluated based on their ability to maximize completion quality for multiple state-of-the-art neural models using the chrF metric. During the public phase of the competition, nineteen teams submitted solutions to the Python track and eight teams submitted solutions to the Kotlin track. In the private phase, six teams competed, of which five submitted papers to the workshop.
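One simple baseline for the context-collection mechanisms the abstract describes is to rank candidate files by lexical overlap with the code around the completion site. The sketch below is illustrative only, not the method of any submission; the function name and scoring scheme are assumptions:

```python
import re
from collections import Counter

def rank_context_files(completion_site, candidate_files, top_k=3):
    """Illustrative baseline: score each (path, body) candidate by
    identifier overlap with the code around the completion site and
    keep the top-k files as context."""
    def tokenize(text):
        return Counter(re.findall(r"[A-Za-z_]\w*", text))

    query = tokenize(completion_site)

    def score(body):
        toks = tokenize(body)
        overlap = sum((query & toks).values())  # multiset intersection
        return overlap / (1 + sum(toks.values()))  # length-normalized

    ranked = sorted(candidate_files, key=lambda pf: score(pf[1]), reverse=True)
    return ranked[:top_k]
```

Length normalization penalizes large files that match only incidentally; real submissions can replace this scorer with embedding similarity, dependency analysis, or learned rankers.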
Problem

Research questions and friction points this paper is trying to address.

Optimizing context collection from codebases for AI code completion
Improving fill-in-the-middle completions for Python and Kotlin languages
Evaluating context efficiency using neural models and chrF metric
Innovation

Methods, ideas, or system contributions that make the work stand out.

Optimized context collection from code repositories
Improved fill-in-the-middle code completion techniques
Evaluated submissions using chrF metric on neural models
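chrF, the metric used to score submissions, is a character n-gram F-score. A minimal sketch is below, assuming the common defaults (n-grams up to order 6, recall-weighted F with beta = 2, whitespace ignored); production evaluation would use a reference implementation such as sacrebleu rather than this simplification:

```python
from collections import Counter

def char_ngrams(text, n):
    # chrF typically ignores whitespace when forming character n-grams
    s = text.replace(" ", "")
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))

def chrf(hypothesis, reference, max_order=6, beta=2.0):
    """Simplified chrF: average character n-gram precision and recall
    over orders 1..max_order, combined into an F-beta score (0-100)."""
    precisions, recalls = [], []
    for n in range(1, max_order + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        overlap = sum((hyp & ref).values())
        if sum(hyp.values()) > 0 and sum(ref.values()) > 0:
            precisions.append(overlap / sum(hyp.values()))
            recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r) * 100
```

Character-level matching makes chrF tolerant of small surface differences (identifier fragments, punctuation), which suits code completion where near-miss outputs still carry value.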
Dmitry Ustalov
JetBrains, Belgrade, Serbia
Egor Bogomolov
JetBrains Research
machine learning for software engineering
Alexander Bezzubov
JetBrains Research, Amsterdam, The Netherlands
Yaroslav Golubev
JetBrains Research
OSS licenses, code changes, refactorings, software ecosystems, empirical software engineering
Evgeniy Glukhov
JetBrains Research, Amsterdam, The Netherlands
Georgii Levtsov
Neapolis University Pafos, JetBrains, Pafos, Cyprus
Vladimir Kovalenko
JetBrains Research, Amsterdam, The Netherlands