Large Language Models for Multilingual Code Intelligence: A Survey

📅 2026-04-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current large language models perform well at code generation for high-resource languages such as Python but noticeably worse on lower-resource languages such as Rust and OCaml, which falls short of what real-world polyglot software systems require. This survey covers two key tasks in multilingual code intelligence: generating code in multiple programming languages from natural-language instructions, and translating code across languages while preserving semantics. It reviews prevailing methods, benchmark datasets, and evaluation metrics, with an emphasis on how well large language models generalize across languages. It identifies core challenges, including inadequate support for low-resource languages and the difficulty of ensuring cross-lingual semantic consistency, and concludes by outlining directions for future research toward trustworthy multilingual code understanding and generation.
📝 Abstract
Large language models have transformed AI-assisted software engineering, but current research remains biased toward high-resource languages such as Python, with weaker performance in languages like Rust and OCaml. Since real-world systems are inherently polyglot, robust multilingual code intelligence is crucial. This survey focuses on two key tasks: multilingual code generation from shared natural-language requirements, and multilingual code translation that preserves semantics across languages. It reviews representative methods, benchmarks, and evaluation metrics, and highlights challenges and opportunities for trustworthy cross-language generalization.
Problem

Research questions and friction points this paper is trying to address.

multilingual code intelligence
large language models
code generation
code translation
cross-language generalization
🔎 Similar Papers
2024-03-25 | 2024 IEEE/ACM First International Conference on AI Foundation Models and Software Engineering (Forge) | Citations: 22