🤖 AI Summary
This work addresses three key challenges in compiler auto-optimization: misalignment between program representations and optimization semantics, inefficient agent-environment interaction, and sparse reward signals. We propose a compiler optimization framework that coordinates knowledge and data by (1) constructing a structured program knowledge graph and a high-quality optimization trajectory dataset; (2) designing a knowledge-guided, adaptive pass-sequence generation mechanism; and (3) integrating large language models, static analysis, reinforcement learning, and supervised fine-tuning into a context-aware hybrid training paradigm. Our contribution is the first semantically aligned compiler optimization agent architecture, which achieves an average 12.7% performance improvement and a 43% reduction in optimization time on LLVM standard benchmarks, significantly outperforming state-of-the-art methods. The implementation is publicly available.
📝 Abstract
Compiler optimization is crucial for enhancing program performance by applying a sequence of optimization passes to a program while maintaining correctness. Despite the promising potential of large language model (LLM)-based agents for software optimization, automating compiler optimization remains challenging due to: (1) semantic misalignment between abstract program representations and concrete optimization passes, (2) inefficient interaction mechanisms between agents and compiler environments, and (3) reward sparsity from the extensive decision-making process within large optimization spaces. This paper introduces AwareCompiler, an agentic framework for compiler optimization that addresses these challenges through three key innovations: structured knowledge integration and dataset construction, knowledge-driven adaptive pass generation, and a data-driven hybrid training pipeline. Experimental results on standard benchmarks demonstrate that AwareCompiler significantly outperforms existing baselines in both performance and efficiency, highlighting the effectiveness of our synergistic knowledge-data-driven approach. Our code is publicly available at https://github.com/LHY-24/AwareCompiler.