🤖 AI Summary
This study investigates whether users performing cybersecurity policy tasks can be misled by adversarially manipulated AI chatbots into adopting recommendations that undermine system defenses. Method: A controlled experiment (N=15) simulated attacker-controlled AI outputs and combined behavioral observation with semi-structured interviews, quantifying user trust and how it is modulated by task familiarity and self-assessed confidence. Contribution/Results: Despite expressed skepticism, nearly half of participants executed high-risk instructions. Trust decisions were significantly moderated by task familiarity (β = 0.42, p < 0.05) and subjective confidence (β = −0.38, p < 0.05). This work provides the first empirical characterization of a dynamic “human-factor trust vulnerability” in AI security, in which users’ cognitive biases interact with manipulated AI outputs, and offers evidence-based design principles for human-AI collaborative defense frameworks resilient to adversarial manipulation.
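The reported coefficients suggest a regression-style model of trust decisions. As a minimal sketch of how such effects could be estimated (the paper's actual analysis pipeline is not shown here; the file name, column names, and logistic specification below are all assumptions for illustration):

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical dataset: one row per participant-task trial.
# complied    -> 1 if the participant executed the chatbot's advice, else 0
# familiarity -> self-reported familiarity with the task (e.g., Likert 1-5)
# confidence  -> self-assessed confidence in one's own judgment (e.g., Likert 1-5)
df = pd.read_csv("trust_trials.csv")

# Logistic regression: does compliance with chatbot advice vary with
# task familiarity and subjective confidence?
model = smf.logit("complied ~ familiarity + confidence", data=df).fit()
print(model.summary())  # fitted coefficients play the role of the reported betas
```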
📝 Abstract
AI chatbots are an emerging security attack vector, vulnerable to threats such as prompt injection and rogue chatbot creation. When deployed in domains such as corporate security policy, they could be weaponized to deliver guidance that intentionally undermines system defenses. We investigate whether users can be tricked by a compromised AI chatbot in this scenario. A controlled study (N=15) asked participants to use a chatbot to complete security-related tasks; without their knowledge, the chatbot was manipulated to give incorrect advice for some of them. The results show how trust in AI chatbots is related to task familiarity and to users' confidence in their own judgment. We also discuss possible reasons why people do or do not trust AI chatbots in different scenarios.