Usability as a Weapon: Attacking the Safety of LLM-Based Code Generation via Usability Requirements

📅 2026-05-11

📈 Citations: 0

✨ Influential: 0

career value

205K/year

🤖 AI Summary

This work addresses the security degradation in large language models (LLMs) during code generation, where explicit usability requirements are prioritized over implicit security constraints. The authors propose UPAttack, a novel attack paradigm that formalizes usability demands as a security attack surface against LLMs, leveraging three types of usability pressures—functional, implementation, and trade-off—to induce violations of secure coding practices. They develop U-SPLOIT, an automated framework integrating task filtering, pressure synthesis, and dynamic vulnerability payload validation, and evaluate it across 75 multilingual scenarios covering 25 Common Weakness Enumerations (CWEs) on mainstream models including GPT-4o and Gemini-1.5-Flash. The attacks achieve up to a 98.1% success rate, exposing critical security flaws inherent in current LLM reward mechanisms.

📝 Abstract

Large Language Models (LLMs) are increasingly used for automated software development, making their ability to preserve secure coding practices critical. In practice, however, many security requirements are implicit or underspecified, whereas usability requirements are explicit and high-signal. This asymmetry motivates our investigation of usability pressure as a practical attack surface: realistic usability-oriented requirements (e.g., new features, performance constraints, or simplicity demands) can cause coding LLMs to satisfy explicit usability goals while silently dropping implicit security constraints -- a form of reward hacking. We formalize this threat as UPAttack and propose U-SPLOIT, an automated framework to craft UPAttack that (i) selects tasks where a model is initially secure, (ii) synthesizes usability pressures by identifying usability rewards of insecure alternatives across three vectors (Functionality, Implementation, Trade-off), and (iii) verifies security regression via both existing test cases and dynamically generated exploit payloads. Across 75 seed scenarios (25 CWEs x 3 cases), spanning multiple languages (Python, C, and JavaScript), U-SPLOIT achieves attack success rates up to 98.1% on multiple state-of-the-art models (e.g., GPT-5.2-chat and Gemini-3-Flash-Preview).

Problem

Research questions and friction points this paper is trying to address.

usability pressure

security regression

LLM-based code generation

reward hacking

implicit security constraints

Innovation

Methods, ideas, or system contributions that make the work stand out.

UPAttack

Usability Pressure

LLM Code Generation