🤖 AI Summary
Existing shell honeypots struggle to achieve flexibility, deep interactivity, and high deception fidelity at the same time, which limits their effectiveness against novel attacks. This paper introduces HoneyGPT, the first ChatGPT-based shell honeypot system, designed to reconcile these competing objectives. Using a structured Chain-of-Thought (CoT) prompting framework, HoneyGPT integrates long-term memory, dynamic interaction modeling, and semantic parsing of security logs, so that all three capabilities are optimized together rather than traded off against one another. A baseline comparison on a collected dataset shows a better balance among flexibility, interactivity, and deception than existing honeypots achieve. During a three-month real-world deployment, HoneyGPT increased detection of previously unseen attack vectors, extended attacker average interaction duration by 3.2×, and achieved a deception success rate of 91.4%.
📝 Abstract
Honeypots, as a strategic cyber-deception mechanism designed to emulate authentic interactions and bait unauthorized entities, often struggle with balancing flexibility, interaction depth, and deception. They typically fail to adapt to evolving attacker tactics, with limited engagement and information gathering. Fortunately, the emergent capabilities of large language models and innovative prompt-based engineering offer a transformative shift in honeypot technologies. This paper introduces HoneyGPT, a pioneering shell honeypot architecture based on ChatGPT, characterized by its cost-effectiveness and proactive engagement. In particular, we propose a structured prompt engineering framework that incorporates chain-of-thought tactics to improve long-term memory and robust security analytics, enhancing deception and engagement. Our evaluation of HoneyGPT comprises a baseline comparison based on a collected dataset and a three-month field evaluation. The baseline comparison demonstrates HoneyGPT's remarkable ability to strike a balance among flexibility, interaction depth, and deceptive capability. The field evaluation further validates HoneyGPT's superior performance in engaging attackers more deeply and capturing a wider array of novel attack vectors.