Prompt Injection attack against LLM-integrated Applications

📅 2023-06-08
🏛️ arXiv.org
📈 Citations: 536
Influential: 38
📄 PDF
🤖 AI Summary
Prompt injection attacks pose an increasingly severe security threat to large language model (LLM) integrated applications, yet existing black-box attack methods suffer from limited practical efficacy. Method: This paper proposes HouYi—the first real-world-oriented, three-stage black-box prompt injection framework comprising pre-prompt injection, context-aware segmentation, and malicious payload delivery. HouYi uniquely enables automated triggering of high-impact consequences—including arbitrary LLM misuse and application-level prompt stealing—via black-box fuzzing, context-aware prompt engineering, and web-injection-inspired modeling. Contribution/Results: Evaluated through real-world penetration testing across 36 mainstream LLM applications, HouYi uncovered 31 critical vulnerabilities, independently confirmed by ten vendors—including Notion—with impact on millions of users. The work significantly advances LLM security practice by bridging the gap between theoretical attack models and deployable, scalable exploitation techniques.
📝 Abstract
Large Language Models (LLMs), renowned for their superior proficiency in language comprehension and generation, stimulate a vibrant ecosystem of applications around them. However, their extensive assimilation into various services introduces significant security risks. This study deconstructs the complexities and implications of prompt injection attacks on actual LLM-integrated applications. Initially, we conduct an exploratory analysis on ten commercial applications, highlighting the constraints of current attack strategies in practice. Prompted by these limitations, we subsequently formulate HouYi, a novel black-box prompt injection attack technique, which draws inspiration from traditional web injection attacks. HouYi is compartmentalized into three crucial elements: a seamlessly-incorporated pre-constructed prompt, an injection prompt inducing context partition, and a malicious payload designed to fulfill the attack objectives. Leveraging HouYi, we unveil previously unknown and severe attack outcomes, such as unrestricted arbitrary LLM usage and uncomplicated application prompt theft. We deploy HouYi on 36 actual LLM-integrated applications and discern 31 applications susceptible to prompt injection. 10 vendors have validated our discoveries, including Notion, which has the potential to impact millions of users. Our investigation illuminates both the possible risks of prompt injection attacks and the possible tactics for mitigation.
Problem

Research questions and friction points this paper is trying to address.

Analyzes prompt injection attacks on LLM-integrated applications
Introduces HouYi, a black-box technique for practical attack execution
Reveals vulnerabilities in real applications, validating risks and mitigations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Novel black-box prompt injection technique HouYi
Three-part injection method for context partition
Unveils severe attack outcomes like unrestricted LLM usage
🔎 Similar Papers
No similar papers found.
Y
Yi Liu
Griffith University
Gelei Deng
Gelei Deng
Nanyang Technological University
CybersecuritySystem securityRobotics SecurityAI SecuritySoftware Testing
Yuekang Li
Yuekang Li
Lecturer (Assistant Professor), University of New South Wales
Software EngineeringSoftware SecurityAI Red Teaming
K
Kailong Wang
Huazhong University of Science and Technology
T
Tianwei Zhang
Nanyang Technological University
Yepang Liu
Yepang Liu
Associate Professor, CSE, Southern University of Science and Technology
Software testing and analysisempirical software engineeringsoftware securitycyber-physical
H
Haoyu Wang
Huazhong University of Science and Technology
Y
Yanhong Zheng
Tianjin University
Y
Yang Liu
Nanyang Technological University