🤖 AI Summary
While large language models (LLMs) for code generation exhibit increasingly sophisticated capabilities, their security alignment remains critically underaddressed: current models frequently generate insecure code containing Common Weakness Enumeration (CWE)-listed vulnerabilities. Method: ProSec is a proactive security alignment framework featuring (i) a CWE-driven, error-inducing coding-scenario synthesis mechanism that constructs a security-focused alignment dataset seven times larger than that of prior work, and (ii) preference learning objectives in the spirit of Direct Preference Optimization (DPO), integrated with instruction tuning for iterative security enhancement. Contribution/Results: The synthesized scenarios trigger 25 times more vulnerable code than a normal instruction-tuning dataset, and experiments show that models trained with ProSec are 25.2%–91.4% more secure than those trained with prior work while preserving the functional correctness of generated code. This work establishes a scalable methodology and empirical foundation for the secure and trustworthy deployment of code LLMs.
📝 Abstract
Recent advances in code-specific large language models (LLMs) have greatly enhanced code generation and refinement capabilities. However, the safety of code LLMs remains under-explored, posing potential risks as insecure code generated by these models may introduce vulnerabilities into real-world systems. Previous work proposes to collect a security-focused instruction-tuning dataset from real-world vulnerabilities. However, this approach is constrained by the data sparsity of vulnerable code and has limited applicability in the iterative post-training workflows of modern LLMs. In this paper, we propose ProSec, a novel proactive security alignment approach designed to align code LLMs with secure coding practices. ProSec systematically exposes the vulnerabilities in a code LLM by synthesizing error-inducing coding scenarios from Common Weakness Enumerations (CWEs) and generating fixes for the vulnerable code snippets, allowing the model to learn secure practices through advanced preference learning objectives. The scenarios synthesized by ProSec trigger 25 times more vulnerable code than a normal instruction-tuning dataset, resulting in a security-focused alignment dataset 7 times larger than that of previous work. Experiments show that models trained with ProSec are 25.2% to 91.4% more secure than those trained with previous work, without degrading the models' utility.
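To illustrate the pipeline the abstract describes, the following is a minimal sketch, not the authors' implementation, of how preference pairs for DPO-style alignment could be assembled: each error-inducing scenario pairs the model's vulnerable completion (rejected) with its generated secure fix (chosen). All field names and the example data are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class PreferencePair:
    prompt: str    # error-inducing coding scenario (e.g., derived from a CWE)
    chosen: str    # fixed, secure code snippet
    rejected: str  # original vulnerable completion

def build_pairs(scenarios):
    """Keep only scenarios whose completion was flagged vulnerable and
    has an available fix; those become DPO-style preference pairs."""
    pairs = []
    for s in scenarios:
        if s["is_vulnerable"] and s.get("fixed_code"):
            pairs.append(PreferencePair(
                prompt=s["instruction"],
                chosen=s["fixed_code"],
                rejected=s["generated_code"],
            ))
    return pairs

# Hypothetical example: a CWE-78 (OS command injection) scenario.
scenarios = [{
    "instruction": "Write a Python function that pings a user-supplied host.",
    "generated_code": 'os.system("ping -c 1 " + host)',         # vulnerable
    "fixed_code": 'subprocess.run(["ping", "-c", "1", host])',  # secure fix
    "is_vulnerable": True,
}]
pairs = build_pairs(scenarios)
```

Pairs in this shape (prompt, chosen, rejected) are the standard input format for preference-learning objectives such as DPO; the actual scenario synthesis and vulnerability detection in ProSec are more involved than this filter suggests.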