🤖 AI Summary
The neural code translation field lacks a systematic, up-to-date survey. Method: We conduct a systematic literature review (SLR) of 57 core studies published between 2020 and 2025, introducing the first seven-dimensional technical framework—covering task formulation, data preprocessing, code modeling, architecture design, training strategies, evaluation protocols, and application scenarios—and integrating thematic coding, trend clustering, and cross-study comparison for qualitative and semi-quantitative analysis. Results: We identify critical bottlenecks, including weak model generalizability and the difficulty of cross-lingual semantic alignment. We confirm the dominance of Transformer-based models, the rising adoption of AST-enhanced modeling, and a broad consensus that BLEU is inadequate for evaluating code translation. Furthermore, we propose reorienting evaluation toward industrial deployment contexts and publicly release the field’s first comprehensive technology landscape map along with a reusable analytical template.
📝 Abstract
Code translation aims to automatically convert code from one programming language to another. It is motivated by the need for multi-language software development and legacy system migration. In recent years, neural code translation has gained significant attention, driven by rapid advances in deep learning and large language models. Researchers have proposed various techniques to improve neural code translation quality. However, to the best of our knowledge, no comprehensive systematic literature review has been conducted to summarize the key techniques and challenges in this field. To fill this gap, we collected 57 primary studies on neural code translation published between 2020 and 2025. We analyze these studies from seven key perspectives: task characteristics, data preprocessing, code modeling, model construction, post-processing, evaluation subjects, and evaluation metrics. Our analysis reveals current research trends, identifies unresolved challenges, and highlights promising directions for future work. These findings can provide valuable insights for both researchers and practitioners in the field of neural code translation.