Leveraging Large Language Models to Detect Missed Peephole Optimizations

📅 2025-08-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of detecting missed peephole optimizations in compiler optimization passes. We propose Lampo, a novel framework that leverages large language models (LLMs) to generate candidate optimizations, employs LLVM’s translation validation infrastructure to formally verify semantic equivalence, and integrates a feedback-driven iterative refinement process for efficient and reliable discovery. Lampo uniquely bridges generative AI and formal verification—marking the first such integration for peephole optimization discovery—thereby overcoming inherent limitations of rule-based or exhaustive search approaches. Evaluated on LLVM, Lampo detects 17 of 25 known missed optimizations on average (up to 22), and within seven months, identifies 26 previously unknown optimization opportunities: 15 have been confirmed by LLVM developers, and 6 have been upstreamed into the main LLVM codebase.

📝 Abstract
By replacing small, suboptimal instruction sequences within programs with more efficient equivalents, peephole optimization not only directly improves code size and performance, but can also enable further transformations in the subsequent optimization pipeline. Although peephole optimization is a critical class of compiler optimizations, discovering new and effective peephole optimizations is challenging because instruction sets can be extremely complex and diverse. Previous methods either do not scale well or capture only a limited subset of peephole optimizations. In this work, we leverage Large Language Models (LLMs) to detect missed peephole optimizations. We propose Lampo, a novel automated framework that synergistically combines the creative but unreliable code optimization ability of LLMs with rigorous correctness verification performed by translation validation tools, integrated in a feedback-driven iterative process. Through a comprehensive evaluation within the LLVM ecosystem, we show that Lampo successfully detects 17 out of 25 previously reported missed optimizations in LLVM on average, and that up to 22 of the 25 can potentially be found by Lampo with different LLMs. For comparison, the state-of-the-art superoptimizer for LLVM, Souper, identified 15 of them. Moreover, within seven months of development and intermittent experiments, Lampo found 26 missed peephole optimizations, 15 of which have been confirmed and 6 already fixed. These results demonstrate Lampo's strong potential in continuously detecting missed peephole optimizations.
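To make the notion concrete, a classic peephole optimization is strength reduction, e.g. rewriting a multiply by a power of two into a shift. The sketch below (a hypothetical illustration, not taken from the paper) models the before/after instruction sequences as functions and spot-checks their equivalence on a few inputs; translation validators such as the one Lampo relies on prove equivalence for all inputs instead.

```python
# Hypothetical illustration of a peephole optimization: the classic
# strength reduction "x * 8" -> "x << 3". A real compiler pass rewrites
# IR instructions; here the two sequences are modeled as functions.

def original(x: int) -> int:
    return x * 8          # suboptimal: integer multiply

def optimized(x: int) -> int:
    return x << 3         # cheaper equivalent: left shift by 3

# Spot-check semantic equivalence on a few sample inputs.
for x in (-5, 0, 1, 123456):
    assert original(x) == optimized(x)
```

A missed optimization, in this setting, is a case where the compiler leaves the `original` form in place even though the `optimized` form is provably equivalent and cheaper.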
Problem

Research questions and friction points this paper is trying to address.

Detecting missed peephole optimizations in compilers
Overcoming limitations of previous optimization discovery methods
Ensuring correctness of LLM-generated code optimizations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Leveraging LLMs to detect missed peephole optimizations
Combining LLMs with rigorous correctness verification tools
Feedback-driven iterative process for optimization discovery
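The three ideas above compose into one loop: an LLM proposes an optimized rewrite, a translation validator checks semantic equivalence, and verification failures feed back into the next attempt. The sketch below is a minimal model of that loop under stated assumptions; the function names and interfaces (`propose`, `verify`, the toy stubs) are hypothetical and do not reflect Lampo's actual API.

```python
# Minimal sketch of a feedback-driven generate-and-verify loop
# (hypothetical interfaces, not Lampo's real implementation).
from typing import Callable, Optional

def refine_loop(
    code: str,
    propose: Callable[[str, str], str],           # LLM: (code, feedback) -> candidate
    verify: Callable[[str, str], Optional[str]],  # validator: None if equivalent,
                                                  # else a counterexample message
    max_iters: int = 5,
) -> Optional[str]:
    feedback = ""
    for _ in range(max_iters):
        candidate = propose(code, feedback)
        counterexample = verify(code, candidate)
        if counterexample is None:
            return candidate          # verified missed optimization found
        feedback = counterexample     # feed the failure back into the prompt
    return None                       # iteration budget exhausted

# Toy usage with stubs standing in for the LLM and the validator.
def toy_propose(code: str, feedback: str) -> str:
    return "x << 3" if feedback else "x << 2"     # first guess is wrong

def toy_verify(original: str, candidate: str) -> Optional[str]:
    return None if candidate == "x << 3" else "mismatch at x=1"

assert refine_loop("x * 8", toy_propose, toy_verify) == "x << 3"
```

The key design point the paper emphasizes is that the verifier, not the LLM, is the source of trust: a candidate is reported only after formal validation, so the LLM's unreliability costs iterations, never correctness.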