🤖 AI Summary
In high-performance computing (HPC), automatic parallelization and data movement optimization are hindered by compilers' inability to accurately analyze strided accesses and loop-carried dependencies in multi-level loop nests that traverse multidimensional arrays. To address this, the paper proposes symbolic, inductive loop optimization (SILO). The technique models loop strides as symbolic variables, giving a unified characterization of both strided access patterns and dependency relations and avoiding the limitations of conventional low-level intermediate representations. By integrating symbolic execution with inductive reasoning, the method enables fine-grained data-flow analysis and coordinates software prefetching, pointer incrementation, and register usage. Evaluated on representative computational kernels, including atmospheric models and numerical solvers, the approach achieves up to 12× speedup over state-of-the-art methods.
📝 Abstract
Scientific computing applications rely heavily on multi-level loop nests operating on multidimensional arrays. This presents multiple optimization opportunities, from exploiting parallelism to reducing data movement through prefetching and improved register usage. HPC frameworks often delegate fine-grained data movement optimization to compilers, but the compilers' low-level representations hamper analysis of common patterns, such as strided data accesses and loop-carried dependencies. In this paper, we introduce symbolic, inductive loop optimization (SILO), a novel technique that models data accesses and dependencies as functions of loop nest strides. This abstraction enables the automatic parallelization of sequentially dependent loops, as well as data movement optimizations including software prefetching and pointer incrementation to reduce register spills. We demonstrate SILO on fundamental kernels from scientific applications with a focus on atmospheric models and numerical solvers, achieving up to 12$\times$ speedup over the state of the art.
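To make the idea of "accesses as functions of loop nest strides" concrete, the following is a minimal sketch and not SILO's implementation. It assumes a row-major array stored in a flat list, and the function names `sum_column_indexed` and `sum_column_incremented` are hypothetical. The first function recomputes the full flat index on every iteration. The second keeps one running offset and advances it by the row stride, which is the general shape of a pointer-incrementation rewrite.

```python
def sum_column_indexed(data, rows, row_stride, col_stride, col):
    """Sum one column, recomputing the flat index on every iteration."""
    total = 0.0
    for i in range(rows):
        # Full index expression: two multiplications and an addition per access.
        total += data[i * row_stride + col * col_stride]
    return total


def sum_column_incremented(data, rows, row_stride, col_stride, col):
    """Sum one column using a running offset advanced by the loop stride."""
    total = 0.0
    offset = col * col_stride  # loop-invariant base, computed once
    for _ in range(rows):
        total += data[offset]
        offset += row_stride  # the next access differs only by the stride
    return total


# Illustrative 3x4 row-major array (assumed layout): row_stride=4, col_stride=1.
rows, cols = 3, 4
data = [float(v) for v in range(rows * cols)]

indexed = sum_column_indexed(data, rows, cols, 1, 1)
incremented = sum_column_incremented(data, rows, cols, 1, 1)
```

Both functions read the same elements (flat indices 1, 5, and 9 for column 1), so they return the same sum. In compiled code, the second form typically needs fewer live index values inside the loop, which is the kind of register-pressure reduction the abstract attributes to pointer incrementation.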