GRACE: Globally-Seeded Representation-Aware Cluster-Specific Evolution for Compiler Auto-Tuning

📅 2025-10-15
🤖 AI Summary
Compiler auto-tuning faces a fundamental trade-off between search efficiency and cross-program generalizability in pass selection and phase ordering. Method: This paper proposes GRACE, a framework for reducing LLVM IR instruction count. Its core components are: (1) representation-aware contrastive learning that improves program embeddings so they capture semantic similarity across programs; (2) cluster-specific evolutionary search, where program clustering prunes the search space so that each cluster is searched separately; and (3) weighted initialization, pass-sequence-based data augmentation, and lightweight refinement at test time. Results: Evaluated on seven benchmark datasets, GRACE reduces IR instruction count by an average of 10.09% (LLVM 10.0.0) and 10.19% (LLVM 18.1.6) relative to opt -Oz, with an average tuning time of less than one second per program.

📝 Abstract
Compiler pass selection and phase ordering present a significant challenge in achieving optimal program performance, particularly for objectives like code size reduction. Standard compiler heuristics offer general applicability but often yield suboptimal, program-specific results due to their one-size-fits-all nature. While iterative compilation can find tailored solutions, its prohibitive search cost limits practical use. Machine learning approaches promise faster inference but frequently struggle with generalization to unseen programs. This paper introduces GRACE, a novel framework for compiler auto-tuning, demonstrated for LLVM IR instruction count optimization. GRACE effectively curtails the search space by leveraging pass synergies and a weighted scoring method to generate initial high-quality candidate sequences and a pass pool. It then employs contrastive learning, using pass sequence-based data augmentation, to create program embeddings that facilitate similarity-aware clustering. Evolutionary search within these clusters yields a coreset of $k$ specialized pass sequences designed for robust generalization to unseen programs. At test time, GRACE efficiently selects the best coreset sequence and refines it using lightweight techniques. Experimental results on seven diverse datasets show that GRACE reduces LLVM IR instruction count by an average of 10.09% on LLVM 10.0.0 and 10.19% on LLVM 18.1.6 compared to opt -Oz, while incurring an average tuning time of less than 1s per program, demonstrating its state-of-the-art performance and practical effectiveness.
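The test-time step described in the abstract (pick the best of the $k$ coreset sequences for an unseen program) can be sketched as follows. This is a minimal illustration, not the authors' implementation: `select_best_sequence`, `reduction_vs_baseline`, and `toy_count` are hypothetical names, the pass names are used only as labels, and the toy cost model stands in for actually compiling the program and counting IR instructions. The baseline count of 1000 is likewise an invented number standing in for an opt -Oz measurement.

```python
from typing import Callable, Dict, Sequence, Tuple

PassSeq = Tuple[str, ...]


def select_best_sequence(
    program: Dict,
    coreset: Sequence[PassSeq],
    count_instructions: Callable[[Dict, PassSeq], int],
) -> Tuple[PassSeq, int]:
    """Evaluate every coreset sequence and keep the one with the fewest instructions."""
    if not coreset:
        raise ValueError("coreset must contain at least one pass sequence")
    best_seq, best_count = None, None
    for seq in coreset:
        count = count_instructions(program, seq)
        if best_count is None or count < best_count:
            best_seq, best_count = seq, count
    return best_seq, best_count


def reduction_vs_baseline(baseline_count: int, tuned_count: int) -> float:
    """Percentage reduction in instruction count relative to a baseline."""
    return 100.0 * (baseline_count - tuned_count) / baseline_count


def toy_count(program: Dict, seq: PassSeq) -> int:
    """Hypothetical cost model: each distinct pass removes a fixed number of instructions."""
    return program["size"] - sum(program["savings"].get(p, 0) for p in set(seq))


# Hypothetical program and coreset (k = 3), for illustration only.
program = {"size": 1000, "savings": {"dce": 50, "inline": 30, "gvn": 40}}
coreset = [("dce", "gvn"), ("inline",), ("dce", "inline", "gvn")]

best_seq, best_count = select_best_sequence(program, coreset, toy_count)
print(best_seq, best_count)                     # ('dce', 'inline', 'gvn') 880
print(reduction_vs_baseline(1000, best_count))  # 12.0
```

Because the coreset is small, evaluating every candidate costs only $k$ measurements per program, which is consistent with the sub-second tuning time the abstract reports. The refinement step that follows selection is not shown here.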
Problem

Research questions and friction points this paper is trying to address.

Optimizing compiler pass selection to improve program performance
Reducing search space for efficient compiler auto-tuning solutions
Enhancing generalization of machine learning for unseen programs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverages pass synergies to generate initial candidate sequences
Uses contrastive learning for similarity-aware program clustering
Performs evolutionary search within clusters for robust generalization
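As a rough sketch of the last point, the following shows one way an evolutionary search over pass sequences could be run for a single cluster of programs. Everything here is assumed for illustration: the function `evolve_for_cluster`, the operators (single-point crossover, single-pass mutation, keep-the-better-half selection), the parameters, and the toy fitness are not taken from the paper, and the pass names serve only as labels.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

PassSeq = Tuple[str, ...]


def evolve_for_cluster(
    pass_pool: Sequence[str],
    fitness: Callable[[PassSeq], float],  # lower is better
    seeds: Sequence[PassSeq],             # non-empty initial candidate sequences
    seq_len: int = 4,
    pop_size: int = 12,                   # must be at least 4
    generations: int = 20,
    rng_seed: int = 0,
) -> PassSeq:
    rng = random.Random(rng_seed)

    def random_seq() -> PassSeq:
        return tuple(rng.choice(pass_pool) for _ in range(seq_len))

    def mutate(seq: PassSeq) -> PassSeq:
        # Replace one randomly chosen position with a pass from the pool.
        i = rng.randrange(len(seq))
        return seq[:i] + (rng.choice(pass_pool),) + seq[i + 1:]

    def crossover(a: PassSeq, b: PassSeq) -> PassSeq:
        # Single-point crossover; sequences too short to cut are returned as-is.
        n = min(len(a), len(b))
        if n < 2:
            return a
        cut = rng.randrange(1, n)
        return a[:cut] + b[cut:]

    # Start from the seeded candidates and fill up with random sequences.
    population: List[PassSeq] = list(seeds)[:pop_size]
    while len(population) < pop_size:
        population.append(random_seq())

    for _ in range(generations):
        population.sort(key=fitness)
        elite = population[: pop_size // 2]  # the better half survives unchanged
        children: List[PassSeq] = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            children.append(mutate(crossover(a, b)))
        population = elite + children

    return min(population, key=fitness)


def toy_count(program: Dict, seq: PassSeq) -> int:
    """Hypothetical cost model: each distinct pass removes a fixed number of instructions."""
    return program["size"] - sum(program["savings"].get(p, 0) for p in set(seq))


# Hypothetical cluster of two similar programs.
cluster = [
    {"size": 1000, "savings": {"dce": 50, "inline": 30, "gvn": 40}},
    {"size": 800, "savings": {"dce": 20, "sroa": 60, "gvn": 10}},
]


def cluster_fitness(seq: PassSeq) -> int:
    # Total instruction count over all programs in the cluster.
    return sum(toy_count(p, seq) for p in cluster)


pool = ["dce", "gvn", "inline", "sroa", "licm"]
seeds = [("dce", "gvn", "dce", "gvn")]
best = evolve_for_cluster(pool, cluster_fitness, seeds)
print(best, cluster_fitness(best))
```

Since the better half of each generation is carried over unchanged, the best fitness never gets worse than that of the seeded candidates. Running such a search once per cluster would give one specialized sequence per cluster, which is one plausible reading of how a coreset of $k$ sequences could arise.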
Haolin Pan
Institute of Software, Chinese Academy of Sciences
AI for Compiler, SIMD Optimization, Compiler Technology
Chao Zha
Institute of Computing Technology, Chinese Academy of Sciences, China, Research Center for High Efficiency Computing Infrastructure, Zhejiang Lab, China, and University of Chinese Academy of Sciences, China
Jinyuan Dong
Institute of Software, Chinese Academy of Sciences, China and University of Chinese Academy of Sciences, China
Mingjie Xing
Institute of Software, Chinese Academy of Sciences, China
Yanjun Wu
Institute of Software, Chinese Academy of Sciences
Computer Science