🤖 AI Summary
Modern compilers struggle to navigate the vast space of code optimizations, and existing approaches that use large language models (LLMs) to optimize source code directly can introduce semantic errors and miss fine-grained, compiler-level optimization opportunities. The paper proposes the first method that uses LLMs to generate semantics-preserving compiler hints, combining LLM reasoning with traditional compiler infrastructure through retrieval-augmented synthesis over compiler documentation and profiling-guided iterative refinement. On the PolyBench and HumanEval-CPP benchmarks, the approach achieves up to a 6.88× geometric mean speedup over -Ofast while preserving program correctness.
📝 Abstract
Code optimization remains a core objective in software development, yet modern compilers struggle to navigate the enormous optimization space. While recent research has explored using large language models (LLMs) to optimize source code directly, these techniques can introduce semantic errors and miss fine-grained, compiler-level optimization opportunities. We present HintPilot, which bridges LLM-based reasoning and traditional compiler infrastructure by synthesizing compiler hints: annotations that steer compiler behavior. HintPilot employs retrieval-augmented synthesis over compiler documentation and applies profiling-guided iterative refinement to synthesize hints that are both semantics-preserving and effective. On the PolyBench and HumanEval-CPP benchmarks, HintPilot achieves up to 6.88x geometric mean speedup over -Ofast while preserving program correctness.
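To make "compiler hints" concrete, here is a minimal sketch of annotations that steer the compiler without changing what the program computes. The abstract does not list the hints HintPilot actually emits, so the `hot` attribute and the `#pragma GCC unroll` directive below are GCC/Clang features chosen purely for illustration, and the function names `dot_plain` and `dot_hinted` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Baseline: a plain dot product with no hints attached.
double dot_plain(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}

// Same computation, annotated with illustrative hints (assumed, not taken
// from the paper). The hints are guarded so other compilers simply skip them.
#if defined(__GNUC__)
__attribute__((hot))  // ask the compiler to optimize this function aggressively
#endif
double dot_hinted(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    const std::size_t n = a.size();
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC unroll 4  // request 4x unrolling of the loop that follows
#endif
    for (std::size_t i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

Both functions perform the same additions in the same order, so they return identical results; the hints only influence how the compiler generates code. Whether a given hint actually yields a speedup depends on the compiler, target, and workload, which is presumably why the abstract pairs hint synthesis with profiling-guided refinement.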