🤖 AI Summary
This work addresses the limitations of existing SQL injection detection methods, which often fail to effectively identify obfuscated and evolving attacks due to their neglect of request-response context. To overcome this, the authors propose a multi-agent honeypot system comprising three specialized agents—request generator, database responder, and traffic monitor—that collaboratively construct, for the first time, a large-scale annotated dataset incorporating realistic response context. This dataset transcends the traditional payload-only paradigm that has constrained prior approaches. Leveraging this enriched data, a CNN-BiLSTM model is trained and demonstrates significant performance gains, achieving over 40% improvement in accuracy across multiple detection tasks and substantially enhancing the capability to detect sophisticated SQL injection attacks.
📝 Abstract
SQL injection remains a major threat to web applications, as existing defenses often fail against obfuscation and evolving attacks because of neglecting the request-response context. This paper presents a context-enriched SQL injection detection framework, focusing on constructing a high-quality request-response dataset via a multi-agent honeypot system: the Request Generator Agent produces diverse malicious/benign requests, the Database Response Agent mediates interactions to ensure authentic responses while protecting production data, and the Traffic Monitor pairs requests with responses, assigns labels, and cleans data, yielding totally 140,973 labeled pairs with contextual cues absent in payload-only data. Experiments show that models trained on this context dataset outperform payload-only counterparts: CNN and BiLSTM achieve over 40\% accuracy improvement in different tasks, validating that the request-response context enhances the detection of evolving and obfuscated attacks.