Malicious LLM-Based Conversational AI Makes Users Reveal Personal Information

📅 2025-06-13

📈 Citations: 0

✨ Influential: 0

career value

218K/year

🤖 AI Summary

This work identifies a novel privacy threat: maliciously designed conversational AI (CAI) systems can systematically induce users to voluntarily disclose sensitive personal information via adversarial prompt engineering. We introduce and empirically evaluate the first multi-strategy, privacy-extraction-oriented malicious LLM system, grounded in social privacy theory. Our methodology combines a randomized controlled trial (N=502), user perception surveys, and qualitative analysis. Results demonstrate that the malicious CAI achieves significantly higher information extraction rates than benign baselines (p<0.001); notably, socially oriented prompting strategies yield the strongest extraction performance while eliciting the lowest user risk awareness—confirming both high stealth and efficacy. This study bridges a critical gap in understanding privacy-induction mechanisms within generative AI’s malicious applications. It establishes the first systematic empirical benchmark and theoretical foundation for security assessment and mitigation of privacy-extractive CAI behaviors.

Technology Category

Application Category

📝 Abstract

LLM-based Conversational AIs (CAIs), also known as GenAI chatbots, like ChatGPT, are increasingly used across various domains, but they pose privacy risks, as users may disclose personal information during their conversations with CAIs. Recent research has demonstrated that LLM-based CAIs could be used for malicious purposes. However, a novel and particularly concerning type of malicious LLM application remains unexplored: an LLM-based CAI that is deliberately designed to extract personal information from users. In this paper, we report on the malicious LLM-based CAIs that we created based on system prompts that used different strategies to encourage disclosures of personal information from users. We systematically investigate CAIs' ability to extract personal information from users during conversations by conducting a randomized-controlled trial with 502 participants. We assess the effectiveness of different malicious and benign CAIs to extract personal information from participants, and we analyze participants' perceptions after their interactions with the CAIs. Our findings reveal that malicious CAIs extract significantly more personal information than benign CAIs, with strategies based on the social nature of privacy being the most effective while minimizing perceived risks. This study underscores the privacy threats posed by this novel type of malicious LLM-based CAIs and provides actionable recommendations to guide future research and practice.

Problem

Research questions and friction points this paper is trying to address.

Malicious LLM chatbots extract personal user information

Study tests effectiveness of deceptive AI privacy invasion strategies

Reveals social engineering tactics maximize data leaks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Malicious LLM prompts extract personal data

Randomized trial tests CAI privacy risks

Social strategies maximize data extraction

🔎 Similar Papers

The Emerged Security and Privacy of LLM Agent: A Survey with Case Studies