X-Coder: Advancing Competitive Programming with Fully Synthetic Tasks, Solutions, and Tests

📅 2026-01-11
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the scalability limitations of large code models in competitive programming, which often stem from reliance on real-world data. To overcome this, we propose SynthSmith—the first framework to train competition-level code models exclusively on synthetic data. Our approach employs a feature-driven synthesis strategy to automatically generate programming tasks, reference solutions, and test cases, followed by supervised fine-tuning and code-centric reinforcement learning. A systematic ablation study validates the design choices. The resulting model, X-Coder-7B, achieves pass rates of 62.9% avg@8 on LiveCodeBench v5 and 55.8% avg@8 on v6, outperforming several 14B-parameter models. This study provides the first empirical evidence that purely synthetic data can effectively support complex code reasoning, demonstrating both the feasibility and scalability of training high-performance code generation models.
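The summary reports results as avg@8 pass rates. A minimal sketch of how an avg@k metric is commonly computed on benchmarks like LiveCodeBench, assuming avg@k means the pass rate averaged over k sampled solutions per problem and then over problems (the function name and data layout here are illustrative, not from the paper):

```python
# Hypothetical avg@k computation: results[i] holds the pass/fail outcomes
# of the k sampled solutions for problem i. The metric is the mean per-problem
# pass rate, reported as a percentage.

def avg_at_k(results: list[list[bool]]) -> float:
    """Average pass rate over k samples per problem, across all problems."""
    per_problem = [sum(r) / len(r) for r in results]
    return 100.0 * sum(per_problem) / len(per_problem)

# Two problems, k=4 samples each: 3/4 and 1/4 of samples pass -> 50.0
print(avg_at_k([[True, True, True, False], [True, False, False, False]]))
```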

📝 Abstract
Competitive programming poses a significant challenge for Code LLMs. While recent models have shown promise, they rely heavily on finite real-world data, raising concerns about scalability and contamination. In this paper, we investigate a critical question: Can we elevate models to expert-level reasoning performance using fully synthetic data? In response, we first observe that off-the-shelf synthesis methods yield suboptimal results in this domain. To address this, we systematically investigate the key factors governing synthetic data quality. Leveraging these findings, we significantly advance the feature-based synthesis paradigm via domain-specific evolution and a dual-verification strategy, promoting task solvability, solution correctness, and test accuracy. Using this high-quality synthetic data, we train the X-Coder model series under an SFT-then-RL paradigm. X-Coder-7B shows significant performance gains on the challenging LiveCodeBench v5 (62.9% avg@8) and v6 (55.8% avg@8), outperforming larger models trained on real-world data. Extensive analysis distills valuable insights into synthetic data scaling, the necessity of domain-adapted feature evolution, and code-centric reinforcement learning.
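The abstract's dual-verification strategy checks, among other things, that synthetic reference solutions agree with synthetic tests. A hedged sketch of one half of such a filter, under the assumption that a (task, solution, tests) triple is kept only if the reference solution passes every generated test; the function names and task are illustrative, not the paper's actual pipeline:

```python
# Illustrative verification filter for synthetic triples: execute the
# reference solution (modeled here as a callable) on each generated test
# and discard the triple on any mismatch or crash.

from typing import Any, Callable

def verify_triple(solution: Callable[..., Any],
                  tests: list[tuple[tuple, Any]]) -> bool:
    """Return True iff the reference solution passes all generated tests."""
    for args, expected in tests:
        try:
            if solution(*args) != expected:
                return False
        except Exception:
            return False  # crashing reference solutions are discarded too
    return True

# Toy synthetic task: "sum of squares from 0 to n"
ref = lambda n: sum(i * i for i in range(n + 1))
good_tests = [((3,), 14), ((0,), 0)]
bad_tests = [((3,), 15)]  # a mislabeled test rejects the triple
print(verify_triple(ref, good_tests), verify_triple(ref, bad_tests))
```

A symmetric check in the other direction (validating tests against multiple independent solutions) would complete the "dual" part of the strategy.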
Problem

Research questions and friction points this paper is trying to address.

competitive programming
code LLMs
synthetic data
reasoning
scalability
Innovation

Methods, ideas, or system contributions that make the work stand out.

synthetic data
code generation
competitive programming
reinforcement learning
scaling laws