FuzzySQL: Uncovering Hidden Vulnerabilities in DBMS Special Features with LLM-Driven Fuzzing

📅 2026-02-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
Traditional database fuzzing struggles to cover obscure yet critical DBMS-specific features such as GTID and stored procedures, and so often misses deep-seated vulnerabilities. To address this limitation, this work proposes FuzzySQL, an adaptive fuzzing framework powered by large language models (LLMs) that integrates grammar-guided input generation, logic-shifting mutation, and hybrid error repair combining rule-based patching with LLM-driven semantic repair. The approach is designed to increase the semantic diversity of test cases and the execution coverage of special-feature code paths. Evaluated on five widely used DBMSs (MySQL, MariaDB, SQLite, PostgreSQL, and ClickHouse), FuzzySQL uncovered 37 previously unknown bugs, seven of which involve special features; 29 have been confirmed, 14 fixed by vendors, and nine assigned CVE identifiers.

📝 Abstract
Traditional database fuzzing techniques primarily focus on syntactic correctness and general SQL structures, leaving critical yet obscure DBMS features, such as system-level modes (e.g., GTID), programmatic constructs (e.g., PROCEDURE), and advanced process commands (e.g., KILL), largely underexplored. Although rarely triggered by typical inputs, these features can lead to severe crashes or security issues when executed under edge-case conditions. In this paper, we present FuzzySQL, a novel LLM-powered adaptive fuzzing framework designed to uncover subtle vulnerabilities in DBMS special features. FuzzySQL combines grammar-guided SQL generation with logic-shifting progressive mutation, a novel technique that explores alternative control paths by negating conditions and restructuring execution logic, synthesizing structurally and semantically diverse test cases. To reach deeper execution coverage of the back end, FuzzySQL employs a hybrid error repair pipeline that unifies rule-based patching with LLM-driven semantic repair, enabling automatic correction of syntactic and context-sensitive failures. We evaluate FuzzySQL across multiple DBMSs, including MySQL, MariaDB, SQLite, PostgreSQL, and ClickHouse, uncovering 37 vulnerabilities, 7 of which are tied to under-tested DBMS special features. As of this writing, 29 cases have been confirmed, 9 of them assigned CVE identifiers; 14 have already been fixed by vendors, and additional vulnerabilities are scheduled to be patched in upcoming releases. Our results highlight the limitations of conventional fuzzers in semantic feature coverage and demonstrate the potential of LLM-based fuzzing to discover deeply hidden bugs in complex database systems.
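
The abstract describes logic-shifting mutation only at a high level ("negating conditions and restructuring execution logic"). As a rough illustration of the condition-negation part alone, the sketch below wraps the first WHERE predicate of a simple statement in NOT (...). The function name `negate_where`, the regex, and the restriction to flat single-WHERE statements are all assumptions made for this example; none of it is taken from FuzzySQL's implementation.

```python
import re

# Hypothetical illustration, not FuzzySQL's code: match the first WHERE
# predicate up to GROUP BY / ORDER BY / LIMIT, a semicolon, or end of input.
_WHERE = re.compile(
    r"\bWHERE\b(?P<pred>.*?)(?=\b(?:GROUP\s+BY|ORDER\s+BY|LIMIT)\b|;|$)",
    re.IGNORECASE | re.DOTALL,
)


def negate_where(sql: str) -> str:
    """Return sql with its first WHERE predicate negated; unchanged if none."""

    def flip(match: re.Match) -> str:
        pred = match.group("pred")
        # Preserve whitespace between the predicate and the next clause.
        trailing = pred[len(pred.rstrip()):]
        return f"WHERE NOT ({pred.strip()}){trailing}"

    return _WHERE.sub(flip, sql, count=1)


if __name__ == "__main__":
    print(negate_where("SELECT * FROM t WHERE a > 1 ORDER BY a;"))
```

A regex is enough for a toy like this but would mis-handle subqueries or string literals containing keywords; a real mutator would presumably operate on a parsed statement.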
Problem

Research questions and friction points this paper is trying to address.

DBMS special features
fuzzing
vulnerability discovery
edge-case conditions
security issues
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-driven fuzzing
logic-shifting mutation
grammar-guided SQL generation
hybrid error repair
DBMS special features
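
The hybrid error repair idea listed above (rule-based patching first, LLM-driven repair as a fallback) can be sketched as a small retry loop. Everything here is an assumption for illustration: the single toy rule, the use of an in-memory SQLite database as the target, and the `llm_repair` callable, which stands in for whatever model call the real system makes.

```python
import sqlite3
from typing import Callable, Optional

# Hypothetical signature for the LLM fallback: (sql, error) -> repaired sql or None.
Repair = Callable[[str, str], Optional[str]]


def rule_based_patch(sql: str, error: str) -> Optional[str]:
    """Toy rule (assumed, not from the paper): create a missing table."""
    prefix = "no such table: "
    if error.startswith(prefix):
        table = error[len(prefix):]
        return f"CREATE TABLE {table} (a INTEGER); {sql}"
    return None


def repair_and_run(sql: str, llm_repair: Repair, max_attempts: int = 3) -> Optional[str]:
    """Run sql on a fresh in-memory SQLite DB, repairing it on failure.

    Returns the script that finally executed, or None if repair gave up.
    """
    for _ in range(max_attempts):
        conn = sqlite3.connect(":memory:")
        try:
            conn.executescript(sql)
            return sql
        except sqlite3.Error as exc:
            message = str(exc)
            # Cheap deterministic rules first; fall back to the (stubbed) LLM.
            patched = rule_based_patch(sql, message) or llm_repair(sql, message)
        finally:
            conn.close()
        if patched is None or patched == sql:
            return None
        sql = patched
    return None


if __name__ == "__main__":
    # Stub standing in for a model call; it only knows one typo.
    stub = lambda sql, err: sql.replace("SELEC ", "SELECT ") if "SELEC " in sql else None
    print(repair_and_run("SELECT * FROM t;", stub))
    print(repair_and_run("SELEC 1;", stub))
```

Trying the rules before the model keeps the common, mechanical failures cheap and deterministic, and reserves the slower fallback for errors that need context to fix.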
Authors

Yongxin Chen, Georgia Institute of Technology (control theory, machine learning, robotics, optimal transport, optimization)
Zhiyuan Jiang, National University of Defense Technology
Chao Zhang, Tsinghua University (software and system security, AI for security, blockchain, data security)
Haoran Xu, National University of Defense Technology
Shenglin Xu, National University of Defense Technology
Jianping Tang, Hunan University
Zheming Li, Sandia National Laboratories (IC engine, laser diagnostic)
Peidai Xie, National University of Defense Technology
Yongjun Wang, Capital Medical University (neurology)