🤖 AI Summary
This work addresses a critical limitation of large language models (LLMs) in generating executable code that is compatible with users' specific software environments, as existing evaluations typically assume isolated or default settings. The paper formally defines and investigates the task of Environment-Aware Code Generation (EACG) for the first time, proposing a tri-axis adaptation strategy that jointly aligns model behavior along data, parameter, and cache dimensions. To support comprehensive evaluation, the authors introduce VersiBCB, the first benchmark enabling multi-package dependency resolution, execution validation, and analysis of deprecation issues. Experimental results demonstrate that the proposed approach substantially improves the compatibility and executability of generated code in real-world heterogeneous environments, revealing fundamental gaps in current LLMs' capacity for environment awareness.
📄 Abstract
Recent progress in large language models (LLMs) has improved code generation, but most evaluations still test isolated, small-scale code (e.g., a single function) under default or unspecified software environments. As a result, it is unclear whether LLMs can reliably generate executable code tailored to a user's specific environment. We present the first systematic study of Environment-Aware Code Generation (EACG), where generated code must be functionally correct and directly executable under arbitrary software configurations. To enable realistic evaluation, we introduce VersiBCB, a benchmark that is multi-package, execution-verified, and deprecation-aware, capturing complex and evolving environments that prior datasets often overlook. Using VersiBCB, we investigate three complementary adaptation axes (data, parameters, and cache) and develop a representative strategy for each. Our results show that current LLMs struggle with environment-specific code generation, while our adaptations improve environment compatibility and executability. These findings highlight key challenges and opportunities for deploying LLMs in practical software engineering workflows.