Eliminate Branches by Melding IR Instructions

2025-12-26
AI Summary
Branch misprediction incurs substantial performance penalties on data-dependent branches, where existing hardware predictors and profile-guided approaches deliver limited gains. This paper proposes an LLVM IR-level compiler optimization that pioneers the application of sequence alignment to branch elimination: it merges semantically similar control-flow paths at the instruction-path level while enforcing semantic correctness via operand-level safety guards, without requiring hardware predicate support. Unlike conventional if-conversion, our approach avoids x86's restrictions on memory instructions and eliminates high speculative overhead by combining IR-level path alignment, static analysis, and lightweight runtime checks. Evaluated on 102 benchmarks, it achieves a 10.9% geometric mean speedup, with peak improvements up to 32×, and introduces significantly lower static instruction overhead than baseline methods.

๐Ÿ“ Abstract
Branch mispredictions incur severe performance penalties on modern processors. While hardware predictors and profile-guided techniques exist, data-dependent branches with irregular patterns remain challenging. Traditional if-conversion eliminates branches via software predication but faces limitations on architectures like x86: it often fails on paths containing memory instructions, or it incurs excessive instruction overhead by fully speculating large branch bodies. This paper presents Melding IR Instructions (MERIT), a compiler transformation that eliminates branches by aligning and melding similar operations from divergent paths at the IR instruction level. Observing that divergent paths often perform structurally similar operations on different operands, MERIT adapts sequence alignment to discover merging opportunities and employs safe operand-level guarding to ensure semantic correctness without hardware predication. Implemented as an LLVM pass and evaluated on 102 programs from four benchmark suites, MERIT achieves a geometric mean speedup of 10.9%, with peak improvements of 32x, compared to relying on the hardware branch predictor, and it does so with reduced static instruction overhead.
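
To make the melding idea concrete, the following Python sketch models a two-way branch whose paths both perform "load, add, store" on different operands, next to a melded version that first selects the operands and then runs the shared operations once. It is an illustration only, not MERIT's implementation: the function names, the `select` helper, and the scratch-slot guard used for the store that exists on only one path are all assumptions made for this sketch. Python's conditional expression merely stands in for an IR-level `select` instruction.

```python
def select(cond, if_true, if_false):
    """Stand-in for an IR `select`: choose a value, no control-flow split in the IR."""
    return if_true if cond else if_false


def branchy(cond, mem, i, j, x, y):
    """Original shape: two paths doing similar work on different operands."""
    if cond:
        t = mem[i] + x      # load, add
        mem[i] = t          # store
    else:
        t = mem[j] + y      # load, add
        mem[j] = t          # store
        mem[j + 1] = 0      # extra store that exists only on this path
    return t


def melded(cond, mem, i, j, x, y):
    """Melded shape: select operands first, then execute each operation once."""
    idx = select(cond, i, j)        # address operand differs per path
    inc = select(cond, x, y)        # value operand differs per path
    t = mem[idx] + inc              # one load, one add
    mem[idx] = t                    # one store

    # Guard for the one-sided store (hypothetical scheme): when the store's
    # path is not taken, redirect it to a private scratch slot so that it
    # has no visible effect on `mem`.
    scratch = [None]
    target = select(cond, scratch, mem)
    pos = select(cond, 0, j + 1)
    target[pos] = 0
    return t
```

Both functions leave `mem` in the same state and return the same value for either value of `cond`. The melded version executes a few extra selects and one harmless store instead of a hard-to-predict branch, which is the trade-off the abstract describes.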
Problem

Research questions and friction points this paper is trying to address.

Eliminates data-dependent branches with irregular patterns
Reduces performance loss from branch mispredictions in processors
Avoids limitations of traditional if-conversion on x86 architectures
Innovation

Methods, ideas, or system contributions that make the work stand out.

Melds similar operations from divergent paths at IR level
Uses sequence alignment to discover branch merging opportunities
Employs operand-level guarding for correctness without hardware predication
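
The alignment step listed above can be pictured with a small Needleman-Wunsch-style global alignment over the opcode sequences of two paths. This is a minimal sketch under stated assumptions: the scoring values, the rule that only identical opcodes may be paired, and the function name `align_paths` are hypothetical choices for illustration, and the paper's actual alignment criteria and cost model may differ.

```python
NEG_INF = float("-inf")


def align_paths(path_a, path_b, match=2, gap=-1):
    """Globally align two opcode sequences.

    Returns a list of (a_op, b_op) pairs. A pair with two opcodes is a
    melding candidate; a pair containing None is an instruction that exists
    on one path only and would need a guard.
    """
    n, m = len(path_a), len(path_b)

    def diag_score(i, j, score):
        # Assumption: only identical opcodes may be paired.
        if path_a[i - 1] != path_b[j - 1]:
            return NEG_INF
        return score[i - 1][j - 1] + match

    # Fill the dynamic-programming table.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                diag_score(i, j, score),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover the alignment.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == diag_score(i, j, score):
            pairs.append((path_a[i - 1], path_b[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            pairs.append((path_a[i - 1], None))
            i -= 1
        else:
            pairs.append((None, path_b[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs


then_path = ["load", "add", "store"]
else_path = ["load", "mul", "add", "store"]
alignment = align_paths(then_path, else_path)
```

For these two toy paths, `load`, `add`, and `store` are paired across the branch, while `mul` is aligned against a gap, marking it as an instruction that appears on the else path only.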