🤖 AI Summary
This study examines whether security training for developers can reduce the backend vulnerabilities that arise when they build software with large language models (LLMs). The authors evaluate a layer-based security training package in a quasi-experimental study that compares pre-training and post-training runs by twelve developers, with developer expertise as an exploratory grouping factor, under a fixed LLM configuration in an identity-centric Java Spring Boot backend. The findings indicate that a non-model-based intervention, namely targeted security training, is associated with a significantly lower security burden in LLM-assisted development: the total number of validated weaknesses decreased by 31.5%, the severity-weighted burden dropped by 38.2%, and critical findings fell by 79.2%, with the largest reductions in authorization and object-access flaws (53.3%) and in authentication, credential-policy, and recovery flaws (44.7%).
📝 Abstract
This paper presents a controlled quasi-experimental developer study examining whether a layer-based security training package is associated with improved security quality in LLM-assisted implementation of an identity-centric Java Spring Boot backend. The study uses a mixed design with a within-subject pre-training versus post-training comparison and an exploratory between-subject expertise factor. Twelve developers completed matched runs under a common interface, fixed model configuration, counterbalanced task sets, and a shared starter project. Security outcomes were assessed via independent manual validation of submitted repositories by the first and second authors. The primary participant-level endpoint was a severity-weighted validated-weakness score. The post-training condition showed a significant paired reduction under an exact Wilcoxon signed-rank test ($p = 0.0059$). In aggregate, validated weaknesses decreased from 162 to 111 (31.5\%), the severity-weighted burden decreased from 432 to 267 (38.2\%), and critical findings decreased from 24 to 5 (79.2\%). The largest reductions were in authorization and object access (53.3\%) and in authentication, credential policy, and recovery weaknesses (44.7\%). Session and browser trust-boundary issues showed minimal change, while sensitive-data and cryptographic weaknesses showed only marginal improvement.
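As a quick arithmetic check, the aggregate percentages quoted above follow directly from the reported pre-training and post-training counts. The short Python sketch below recomputes them; the dictionary keys and the helper name `relative_reduction` are illustrative choices, not identifiers from the paper.

```python
# Aggregate counts as reported in the abstract: (pre-training, post-training).
reported = {
    "validated weaknesses": (162, 111),
    "severity-weighted burden": (432, 267),
    "critical findings": (24, 5),
}


def relative_reduction(pre: int, post: int) -> float:
    """Percentage decrease from pre to post, relative to the pre value."""
    return 100.0 * (pre - post) / pre


# Round to one decimal place, matching the precision used in the abstract.
reductions = {
    name: round(relative_reduction(pre, post), 1)
    for name, (pre, post) in reported.items()
}

for name, pct in reductions.items():
    print(f"{name}: {pct}%")
```

The three values round to 31.5, 38.2, and 79.2, in agreement with the figures stated in the abstract.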
These results suggest that, under the tested conditions, the training package is associated with a lower validated security burden in LLM-assisted backend development without any modification to the model. They do not support treating training as a replacement for secure defaults, static analysis, expert review, or operational hardening.
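For readers unfamiliar with the paired test named in the abstract, the following is a minimal sketch of a two-sided exact Wilcoxon signed-rank test in plain Python. It is not the authors' analysis code. The per-participant scores in `pre_scores` and `post_scores` are hypothetical values invented for illustration (the abstract reports only aggregates), so the output is not expected to match the reported $p = 0.0059$. The sketch also assumes that zero differences are dropped and that no two nonzero differences share the same absolute value.

```python
def exact_wilcoxon_signed_rank(pre, post):
    """Two-sided exact Wilcoxon signed-rank test for paired samples.

    Assumptions of this sketch: zero differences are dropped, and the
    remaining absolute differences are all distinct (no tied ranks).
    Returns (w, p), where w is the smaller of the two signed rank sums.
    """
    diffs = [a - b for a, b in zip(pre, post) if a != b]
    n = len(diffs)
    abs_sorted = sorted(abs(d) for d in diffs)
    if len(set(abs_sorted)) != n:
        raise ValueError("tied absolute differences are not handled here")

    # Rank the absolute differences from 1 (smallest) to n (largest).
    rank = {value: i + 1 for i, value in enumerate(abs_sorted)}
    w_plus = sum(rank[abs(d)] for d in diffs if d > 0)
    w_minus = n * (n + 1) // 2 - w_plus
    w = min(w_plus, w_minus)

    # Exact null distribution: every rank carries a + or - sign with equal
    # probability, so count the subsets of {1..n} that reach each rank sum.
    max_sum = n * (n + 1) // 2
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for r in range(1, n + 1):
        for s in range(max_sum, r - 1, -1):
            counts[s] += counts[s - r]

    tail = sum(counts[: w + 1])
    p = min(1.0, 2 * tail / 2 ** n)
    return w, p


# Hypothetical severity-weighted scores for twelve participants
# (invented for illustration; not data from the study).
pre_scores = [40, 35, 50, 28, 33, 45, 38, 30, 42, 36, 31, 37]
post_scores = [28, 26, 30, 29, 26, 30, 28, 32, 28, 28, 28, 26]

w_stat, p_value = exact_wilcoxon_signed_rank(pre_scores, post_scores)
print(f"W = {w_stat}, exact two-sided p = {p_value:.5f}")
```

An exact test is a natural choice at this sample size: with twelve pairs there are at most $2^{12} = 4096$ sign assignments, so the null distribution can be enumerated directly rather than approximated by a normal distribution.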