AI Summary
This work addresses the semantic limitations of loop-level approaches in automatic C++-to-OpenMP translation, which struggle to capture function-level contextual information. To this end, we propose an OpenMP-oriented domain-specific encoder-decoder Transformer model. Methodologically: (1) we design a parallelism-aware pretraining objective that models shared-memory parallel patterns at the function level; and (2) we introduce OMPBLEU, a composite evaluation metric jointly assessing correctness of OpenMP constructs and overall code quality. Experiments demonstrate that our approach significantly outperforms conventional parallelization tools and general-purpose code language models on function-level translation tasks. It exhibits strong generalization capability and robustness on real-world codebases. Our work establishes a novel paradigm for large language model-driven migration of high-performance computing code to OpenMP.
Abstract
Recent advances in large language models (LLMs) have significantly accelerated progress in code translation, enabling more accurate and efficient transformation across programming languages. While originally developed for natural language processing, LLMs have shown strong capabilities in modeling programming language syntax and semantics, outperforming traditional rule-based systems in both accuracy and flexibility. These models have streamlined cross-language conversion, reduced development overhead, and accelerated legacy code migration. In this paper, we introduce OMPILOT, a novel domain-specific encoder-decoder transformer tailored for translating C++ code into OpenMP, enabling effective shared-memory parallelization. OMPILOT leverages custom pre-training objectives that incorporate the semantics of parallel constructs and combines both unsupervised and supervised learning strategies to improve code translation robustness. Unlike previous work that focused primarily on loop-level transformations, OMPILOT operates at the function level to capture a wider semantic context. To evaluate our approach, we propose OMPBLEU, a novel composite metric specifically crafted to assess the correctness and quality of OpenMP parallel constructs, addressing limitations in conventional translation metrics.
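To make the function-level framing concrete, the sketch below shows the kind of input and output such a translation might involve. It is an illustration only and is not taken from the paper or from OMPILOT's output: the function names `normalize_serial` and `normalize_openmp` are hypothetical. The point is that a translator seeing the whole function can place both loops in a single parallel region, whereas a loop-level tool would typically annotate each loop in isolation.

```cpp
#include <vector>

// Serial input: a plain C++ function with two dependent loops.
void normalize_serial(std::vector<double>& v) {
    double sum = 0.0;
    for (long i = 0; i < static_cast<long>(v.size()); ++i) {
        sum += v[i];
    }
    for (long i = 0; i < static_cast<long>(v.size()); ++i) {
        v[i] /= sum;
    }
}

// Hypothetical OpenMP target (an assumption, not the paper's output):
// one parallel region spans both loops, so the thread team is created once.
void normalize_openmp(std::vector<double>& v) {
    double sum = 0.0;
    const long n = static_cast<long>(v.size());
    #pragma omp parallel
    {
        // Each thread accumulates a private partial sum; the partial sums
        // are combined into the shared `sum` at the end of this loop.
        #pragma omp for reduction(+:sum)
        for (long i = 0; i < n; ++i) {
            sum += v[i];
        }
        // The implicit barrier after the loop above guarantees that every
        // thread sees the final value of `sum` before scaling.
        #pragma omp for
        for (long i = 0; i < n; ++i) {
            v[i] /= sum;
        }
    }
}
```

Compiled without OpenMP support, the pragmas are ignored and both functions behave identically; with a flag such as `-fopenmp` on GCC or Clang, the second version runs its loops across threads.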