PerfCoder: Large Language Models for Interpretable Code Performance Optimization

📅 2025-12-15
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Current large language models (LLMs) face significant bottlenecks in autonomously generating high-performance code, primarily due to the lack of interpretable, performance-oriented supervision. To address this, we propose a novel reinforcement fine-tuning (RFT) paradigm, runtime-measurement-driven preference alignment with human-readable optimization-trajectory supervision, that enables input-aware, policy-level optimization decisions and produces interpretable, actionable feedback. We further introduce a planner-and-optimizer collaborative architecture that supports end-to-end, iteration-free translation from source code to optimized code. Trained on a real-world optimization-trajectory dataset, our approach achieves state-of-the-art results on the PIE benchmark, outperforming all existing methods. It also substantially enhances the optimization capability of both 32B open-weight models and GPT-5, achieving the highest measured speedup ratios and effective optimization rates.

📝 Abstract
Large language models (LLMs) have achieved remarkable progress in automatic code generation, yet their ability to produce high-performance code, a critical requirement in real-world software systems, remains limited. We argue that current LLMs struggle not only due to data scarcity but, more importantly, because they lack supervision that guides interpretable and effective performance improvements. In this work, we introduce PerfCoder, a family of LLMs specifically designed to generate performance-enhanced code from source code via interpretable, customized optimizations. PerfCoder is fine-tuned on a curated collection of real-world optimization trajectories with human-readable annotations, and preference-aligned by reinforcement fine-tuning using runtime measurements, enabling it to propose input-specific improvement strategies and apply them directly without relying on iterative refinement. On the PIE code performance benchmark, PerfCoder surpasses all existing models in both runtime speedup and effective optimization rate, demonstrating that performance optimization cannot be achieved by scale alone but requires optimization strategy awareness. In addition, PerfCoder can generate interpretable feedback about the source code, which, when provided as input to a larger LLM in a planner-and-optimizer cooperative workflow, further improves outcomes. Specifically, we elevate the performance of 32B models and GPT-5 to new levels on code optimization, substantially surpassing their original results.
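As a rough illustration of the runtime-measurement-driven preference signal the abstract describes, the sketch below turns wall-clock measurements of candidate optimizations into preference pairs. All names, the best-of-N measurement scheme, and the `min_speedup` threshold are hypothetical assumptions for illustration, not details taken from the paper:

```python
import time

def measure(fn, *args, runs: int = 5) -> float:
    """Best-of-N wall-clock time for a candidate implementation
    (a stand-in for the paper's real benchmark harness)."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best

def build_preference_pairs(source_time: float, candidate_times: dict,
                           min_speedup: float = 1.1):
    """Turn runtime measurements into (preferred, rejected) pairs for
    preference alignment: a candidate counts as an effective optimization
    only if it beats the source by at least min_speedup; the fastest
    effective candidate is preferred over every rejected one."""
    effective = {name: t for name, t in candidate_times.items()
                 if source_time / t >= min_speedup}
    rejected = [name for name in candidate_times if name not in effective]
    if not effective:
        return []
    best = min(effective, key=effective.get)
    return [(best, r) for r in rejected]
```

For example, with a source runtime of 1.0 s, a 0.5 s candidate `a` and a 1.2 s candidate `b` yield the single pair `("a", "b")`; a candidate within the noise threshold produces no pair at all, mirroring the idea that only measured, effective speedups supervise the policy.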
Problem

Research questions and friction points this paper is trying to address.

Optimizes code for performance using interpretable LLM-based strategies.
Generates high-performance code via supervised, annotated optimization trajectories.
Enhances existing LLMs' code optimization through interpretable feedback integration.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-tuned on annotated real-world optimization trajectories
Preference-aligned via reinforcement fine-tuning with runtime measurements
Generates interpretable feedback for cooperative planner-optimizer workflows
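The planner-and-optimizer cooperative workflow listed above could be sketched roughly as a single iteration-free pass: the smaller model emits human-readable optimization feedback, and a larger model applies it. `planner` and `optimizer` are placeholder callables for the two LLMs, and the prompts are illustrative assumptions, not the paper's actual prompts:

```python
def planner_optimizer_workflow(source_code: str, planner, optimizer) -> str:
    """One-shot cooperative workflow: the planner produces interpretable
    optimization strategies for the input program, and the optimizer
    rewrites the program guided by those strategies."""
    feedback = planner(
        "Analyze this program and list concrete optimization strategies "
        "(e.g. algorithmic complexity, memory access, I/O):\n" + source_code
    )
    optimized = optimizer(
        "Rewrite the program applying these optimization strategies. "
        "Return only the optimized code.\n\nStrategies:\n" + feedback
        + "\n\nProgram:\n" + source_code
    )
    return optimized
```

Because the planner's feedback is plain text, it can be handed to any stronger model, which matches the reported gains when 32B models and GPT-5 act as the optimizer.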