🤖 AI Summary
This study investigates whether large language models (LLMs) can serve as intelligent collaborators to bridge the expertise gap between cybersecurity professionals and non-experts in phishing email detection and intrusion detection. Method: We designed an LLM-based interactive collaboration framework and conducted a mixed-methods user study (n=58), complemented by cognitive task analysis and discourse coding. Contribution/Results: We provide the first empirical evidence that LLM collaboration significantly reduces false positives (in phishing detection) and false negatives (in intrusion detection), with performance gains persisting when users subsequently work independently. Moreover, interaction dimensions, including explanation style, certainty level, and linguistic tone, significantly moderate user trust and decision revision behavior. The study uncovers the “explanation–trust–revision” mechanism underlying human-AI collaboration, offering both theoretical foundations and practical design principles for trustworthy AI-assisted cybersecurity decision-making.
📝 Abstract
This study investigates whether large language models (LLMs) can function as intelligent collaborators to bridge expertise gaps in cybersecurity decision-making. We examine two representative tasks, phishing email detection and intrusion detection, that differ in data modality, cognitive complexity, and user familiarity. In a controlled mixed-methods user study with 58 participants (phishing, n = 34; intrusion, n = 24), we find that human-AI collaboration improves task performance, reducing false positives in phishing detection and false negatives in intrusion detection. A learning effect is also observed when participants transition from collaboration to independent work, suggesting that LLMs can support long-term skill development. Our qualitative analysis shows that interaction dynamics, such as LLM definitiveness, explanation style, and tone, influence user trust, prompting strategies, and decision revision. Users engaged in more analytic questioning and relied more heavily on LLM feedback in high-complexity settings. These results provide design guidance for building interpretable, adaptive, and trustworthy human-AI teaming systems, and demonstrate that LLMs can meaningfully support non-experts in reasoning through complex cybersecurity problems.