A Unified Framework for Automated Code Transformation and Pragma Insertion

📅 2024-05-05

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

Problem: In high-level synthesis (HLS), jointly optimizing code transformations, pragma insertion, and tile (cache-blocking) size selection is challenging because the three decisions are tightly coupled, the decision space is vast, and semantic correctness is hard to guarantee. Method: This paper proposes a unified framework that encodes all three decisions as a single optimization problem, including the option of applying no transformation at all. It relies on HLS compilers and a Non-Linear Programming (NLP) formulation to optimize regular loop-based kernels. Contribution/Results: Transformations, pragmas, and tile sizes are determined simultaneously rather than in separate pre-processing steps. In the authors' evaluation, the framework identifies the appropriate transformations, including cases where none is needed, and inserts pragmas that achieve a favorable quality of results (QoR).

📝 Abstract

High-level synthesis (HLS), source-to-source compilers, and various Design Space Exploration (DSE) techniques for pragma insertion have significantly improved the Quality of Results (QoR) of generated designs. These tools offer benefits such as reduced development time and enhanced performance. However, achieving high-quality results often requires additional manual code transformations and tiling selections, which are typically performed separately or as pre-processing steps. Although DSE techniques enable code transformation upfront, the vastness of the search space often limits the exploration of all possible code transformations, making it challenging to determine which transformations are necessary. Additionally, ensuring correctness remains challenging, especially for complex transformations and optimizations. To tackle this obstacle, we first propose a comprehensive framework leveraging HLS compilers. Our system streamlines code transformation, pragma insertion, and tile size selection for on-chip data caching through a unified optimization problem, aiming to enhance parallelization, which is particularly beneficial for computation-bound kernels. Then, employing a novel Non-Linear Programming (NLP) approach, we simultaneously ascertain transformations, pragmas, and tile sizes, focusing on regular loop-based kernels. Our evaluation demonstrates that our framework adeptly identifies the appropriate transformations, including scenarios where no transformation is necessary, and inserts pragmas to achieve a favorable Quality of Results.
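To make the idea of a single joint decision problem concrete, here is a minimal sketch in Python. It is not the authors' formulation: the abstract describes an NLP, whereas this sketch swaps in plain exhaustive enumeration over a tiny two-loop nest. The cost model, the constants (`DEP_LATENCY`, `BURST_SETUP`, `INTERCHANGE_PENALTY`), and every name (`Loop`, `latency`, `optimize`) are hypothetical and chosen only for illustration. The legality of loop interchange is taken as an input flag rather than derived from the code.

```python
from dataclasses import dataclass

# All constants below are made-up illustration values, not figures from the paper.
DEP_LATENCY = 4          # cycles of a loop-carried dependence (e.g. an accumulate)
BURST_SETUP = 8          # cycles to start one off-chip burst per tile
INTERCHANGE_PENALTY = 2  # extra burst cost assumed for strided access after interchange


@dataclass(frozen=True)
class Loop:
    trip: int       # trip count
    parallel: bool  # True if the loop carries no dependence


def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


def latency(outer, inner, tile, unroll, interchanged):
    """Toy cycle estimate for a 2-deep nest with the inner loop tiled and unrolled."""
    # Unrolling a dependent loop only lengthens the dependence chain in this model.
    ii = 1 if inner.parallel else DEP_LATENCY * unroll
    compute = outer.trip * (inner.trip // unroll) * ii
    setup = BURST_SETUP * (INTERCHANGE_PENALTY if interchanged else 1)
    transfer = outer.trip * (inner.trip // tile) * setup
    return compute + transfer


def optimize(outer, inner, dsp_budget, buffer_words, interchange_legal=True):
    """Return (cycles, interchanged, tile, unroll) minimizing the toy latency."""
    best = None
    choices = (False, True) if interchange_legal else (False,)
    for interchanged in choices:
        o, i = (inner, outer) if interchanged else (outer, inner)
        for tile in divisors(i.trip):
            if tile > buffer_words:       # tile must fit in the on-chip buffer
                continue
            for unroll in divisors(tile):
                if unroll > dsp_budget:   # one DSP per unrolled lane (assumed)
                    continue
                cand = (latency(o, i, tile, unroll, interchanged),
                        interchanged, tile, unroll)
                # Tuple order breaks ties in favor of "no transformation".
                if best is None or cand < best:
                    best = cand
    return best


# Inner loop is a reduction, outer loop is parallel: interchange pays off.
print(optimize(Loop(64, True), Loop(64, False), dsp_budget=16, buffer_words=32))
# Inner loop is already parallel: the search keeps the original loop order.
print(optimize(Loop(64, False), Loop(64, True), dsp_budget=16, buffer_words=32))
```

Under these assumed numbers, the first call selects interchange with a tile of 32 and an unroll factor of 16, while the second keeps the original loop order with the same tile and unroll factor. The point of the sketch is that the transformation choice, the tile size, and the unroll pragma are scored together by one objective under shared resource constraints, so "apply no transformation" is simply one of the candidate solutions.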

Problem

Research questions and friction points this paper is trying to address.

Code Modification

Automation

Simplification

Innovation

Methods, ideas, or system contributions that make the work stand out.

Unified Approach

Sophisticated Compiler

Concurrent Management
