LLM-Assisted Translation of Legacy FORTRAN Codes to C++: A Cross-Platform Study

📅 2025-04-21
🤖 AI Summary
This study addresses the modernization of legacy Fortran codes in high-performance computing (HPC) by systematically evaluating the applicability and accuracy of large language models (LLMs) for cross-language translation (Fortran → C++). We propose the first reproducible proxy-based translation evaluation framework, quantifying performance across four dimensions: compilation correctness, semantic fidelity (measured via CodeBLEU), numerical consistency, and cross-platform compatibility (x86/ARM). Evaluated on diverse scientific computing benchmarks using open-source LLMs, our approach achieves up to 89% compilation success rate, 76% average semantic similarity relative to human-authored translations, and >92% numerical consistency. Our key contribution is establishing the first multi-dimensional evaluation paradigm for LLM-based translation of scientific code, empirically validating both its feasibility and inherent limitations in realistic HPC environments.

📝 Abstract
Large Language Models (LLMs) are increasingly being leveraged by both domain experts and non-domain experts to generate and translate scientific computer codes. Fortran has served as one of the go-to programming languages in legacy high-performance computing (HPC) for scientific discoveries. Despite growing adoption, LLM-based translation of legacy codebases has not been thoroughly assessed or quantified for its usability. Here, we studied the applicability of LLM-based translation of Fortran to C++ as a step towards building an agentic workflow using open-weight LLMs on two different computational platforms. We statistically quantified the compilation accuracy of the translated C++ codes, measured the similarity of the LLM-translated code to the human-translated C++ code, and statistically quantified the similarity between the outputs of the original Fortran codes and their C++ translations.
Problem

Research questions and friction points this paper is trying to address.

Assessing LLM-based Fortran to C++ translation accuracy
Comparing LLM-translated code with human-translated C++
Quantifying output similarity between Fortran and C++
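
As an illustration of the third question, the sketch below scores how closely the printed output of a Fortran program matches that of its C++ translation. It is not the authors' procedure: the function names, the regular expression, and the tolerances are all assumptions chosen for the example, and it simply compares the numbers found in each output position by position.

```python
import math
import re

# Hypothetical helpers for illustration; not taken from the paper.
# Matches integers, decimals, and exponents, including Fortran-style 'D' exponents.
_NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eEdD][-+]?\d+)?")


def extract_numbers(text):
    """Return every numeric literal found in a program's text output."""
    return [
        float(token.replace("d", "e").replace("D", "e"))
        for token in _NUMBER.findall(text)
    ]


def output_match_fraction(fortran_out, cpp_out, rel_tol=1e-6, abs_tol=1e-12):
    """Fraction of numeric values that agree, position by position.

    Values missing from the shorter output count as mismatches.
    The tolerances are assumed defaults, not values from the study.
    """
    a = extract_numbers(fortran_out)
    b = extract_numbers(cpp_out)
    if not a and not b:
        return 1.0  # nothing numeric to compare
    matches = sum(
        math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
        for x, y in zip(a, b)
    )
    return matches / max(len(a), len(b))
```

A tolerance-based comparison is used here because Fortran and C++ runtimes format floating-point values differently, so exact string comparison would under-report agreement.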
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-assisted Fortran to C++ translation
Cross-platform compilation accuracy assessment
Statistical output similarity quantification
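
One way to put a statistical bound on a compilation success rate is a binomial confidence interval. The sketch below uses the Wilson score interval; this choice, the function name, and the example counts are assumptions made for illustration, since the summary does not say which statistical method the authors used.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion.

    successes: number of translated programs that compiled (hypothetical input)
    trials:    number of translation attempts
    z:         normal quantile; 1.96 corresponds to roughly 95% confidence
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    p = successes / trials
    denom = 1.0 + z * z / trials
    centre = (p + z * z / (2.0 * trials)) / denom
    half_width = (
        z * math.sqrt(p * (1.0 - p) / trials + z * z / (4.0 * trials * trials)) / denom
    )
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Example with made-up counts: 45 of 50 translations compiled.
low, high = wilson_interval(45, 50)
```

Computing the interval separately for each platform would show whether a difference in compilation rates between the two machines is larger than the sampling noise.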
Authors

Nishath Rajiv Ranasinghe
Los Alamos National Laboratory
Seismology, Geophysics, Machine Learning

Shawn M. Jones
Los Alamos National Laboratory
Web Science, Digital Preservation, Web Archiving, @WebSciDL

Michal Kucer
Staff Scientist, Los Alamos National Laboratory
Computer Vision, Deep Learning, Machine Learning

Ayan Biswas
Los Alamos National Laboratory, Los Alamos NM 87545

Daniel O'Malley
Los Alamos National Laboratory
Applied Mathematics, Machine Learning, Computational Science, Quantum Computing

Alexander Most
Los Alamos National Laboratory, Los Alamos NM 87545

Selma Liliane Wanna
Los Alamos National Laboratory, Los Alamos NM 87545

Ajay Sreekumar
School of Information, University of Arizona, 103 E 2nd St #4, Tucson, AZ 85721