AI Summary
This work addresses the critical challenge that large language models (LLMs) for code generation often introduce security vulnerabilities, while existing alignment methods typically compromise functional correctness. To reconcile this trade-off, we propose SecCoderX, a novel framework that leverages mature vulnerability detection resources to construct realistic, diverse vulnerability-inducing tasks and trains a reasoning-based vulnerability reward model for online reinforcement learning. This approach provides LLMs with scalable and reliable safety signals without degrading code functionality. Experimental results demonstrate that SecCoderX improves the Effective Safety Rate by approximately 10% while preserving functional performance, substantially alleviating the tension between safety and utility; prior methods, by contrast, commonly incur performance drops of 14% to 54%.
Abstract
Large language models (LLMs) are increasingly used in software development, yet their tendency to generate insecure code remains a major barrier to real-world deployment. Existing secure code alignment methods often suffer from a functionality-security paradox, improving security at the cost of substantial utility degradation. We propose SecCoderX, an online reinforcement learning framework for functionality-preserving secure code generation. SecCoderX bridges vulnerability detection and secure code generation by repurposing mature detection resources in two ways: (i) synthesizing diverse, reality-grounded vulnerability-inducing coding tasks for online RL rollouts, and (ii) training a reasoning-based vulnerability reward model that provides scalable and reliable security supervision. These components are unified in an online RL loop that aligns code LLMs to generate code that is both secure and functional. Extensive experiments demonstrate that SecCoderX achieves state-of-the-art performance, improving Effective Safety Rate (ESR) by approximately 10% over unaligned models, whereas prior methods often degrade ESR by 14-54%. We release our code, dataset, and model checkpoints at https://github.com/AndrewWTY/SecCoderX.
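To make the abstract's training loop concrete, here is a minimal, hypothetical sketch of how a vulnerability reward model and a functionality signal could be combined during online RL rollouts. All names (`VulnerabilityRewardModel`, `rl_step`, the `alpha` weight, the toy `strcpy` heuristic) are illustrative assumptions, not SecCoderX's actual implementation.

```python
class VulnerabilityRewardModel:
    """Stand-in for the reasoning-based vulnerability reward model:
    scores generated code as secure (1.0) or vulnerable (0.0).
    A toy substring check replaces the real learned judge."""
    def score(self, task: str, code: str) -> float:
        return 0.0 if "strcpy(" in code else 1.0


def functional_reward(task: str, code: str) -> float:
    """Stand-in for a functionality check (e.g. unit tests).
    This sketch assumes the tests pass."""
    return 1.0


def rl_step(tasks, generate, reward_model, alpha=0.5):
    """One rollout step over vulnerability-inducing tasks:
    blend the security and functionality rewards with weight alpha
    (an assumed hyperparameter). In a real loop, these rewards would
    drive a policy-gradient update of the code LLM."""
    rewards = []
    for task in tasks:
        code = generate(task)
        r = (alpha * reward_model.score(task, code)
             + (1 - alpha) * functional_reward(task, code))
        rewards.append(r)
    return rewards


rm = VulnerabilityRewardModel()
tasks = ["copy a user-supplied string into a buffer"]
# Toy "policy rollout": emits bounds-checked code, so it earns full reward.
generate = lambda t: "strncpy(dst, src, sizeof(dst) - 1);"
print(rl_step(tasks, generate, rm))  # [1.0]
```

The key design point mirrored here is that the security signal comes from a separate reward model rather than from static filters, so the same loop scales to new task distributions without hand-written rules.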