🤖 AI Summary
Existing generation-based testing tools struggle to detect logic bugs in database query optimizers when queries involve multi-table joins. This work proposes TQS, a testing framework that combines database normalization with isomorphic-graph modeling of the query space. Its data-guided schema and query generation (DSG) component builds test schemas and ground-truth results, using a bitmap index for result tracking and injected noise data to improve testing efficiency. Its knowledge-guided query space exploration (KQE) component casts query generation as isomorphic graph set discovery, combining graph embedding with weighted random walks to avoid repeated exploration. Within 24 hours, TQS uncovered 115 logic bugs across MySQL, MariaDB, TiDB, and X-DB.
📝 Abstract
Generation-based testing techniques have proven effective at detecting logic bugs in DBMSs, which are often caused by faulty implementations of query optimizers. However, existing generation-based testing tools are limited to single-table queries, leaving a substantial research gap for multi-table queries with join operators. In this paper, we propose TQS, a novel testing framework for detecting logic bugs triggered by queries involving multi-table joins. Given a target DBMS, TQS achieves this goal with two key components: Data-guided Schema and Query Generation (DSG) and Knowledge-guided Query Space Exploration (KQE). DSG addresses the key challenge of multi-table query testing: how to generate ground-truth (query, result) pairs for verification. It applies database normalization to generate a testing schema and maintains a bitmap index for result tracking. To improve testing efficiency, DSG also injects noise into the generated data. To avoid repeatedly searching the same query space, KQE formulates the problem as isomorphic graph set discovery and combines graph embedding with weighted random walks for query generation. We evaluated TQS on four popular DBMSs: MySQL, MariaDB, TiDB, and the gray release of an industry-leading cloud-native database, anonymized as X-DB. Experimental results show that TQS is effective at finding logic bugs in join optimization. It detected 115 bugs within 24 hours: 31 in MySQL, 30 in MariaDB, 31 in TiDB, and 23 in X-DB.
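The ground-truth idea behind DSG can be illustrated with a minimal sketch. This is not the TQS implementation. It assumes a hypothetical wide table `WIDE`, an assumed functional dependency `customer -> city` used for the normalization split, and SQLite (via Python's `sqlite3` module) as a stand-in for the target DBMS. All table, column, and function names are invented for illustration, and noise injection is omitted. A bitmap over the wide table's rows records which rows a join query should return, so the expected result can be computed without trusting the DBMS under test.

```python
import sqlite3

# Hypothetical wide table: (row_id, customer, city, item).
WIDE = [
    (0, "alice", "Oslo", "pen"),
    (1, "alice", "Oslo", "ink"),
    (2, "bob", "Rome", "pen"),
]

def normalize(wide):
    """Split the wide table on the assumed dependency customer -> city."""
    customers = sorted({(c, city) for _, c, city, _ in wide})
    orders = [(rid, c, item) for rid, c, _, item in wide]
    return customers, orders

def bitmap_for_item(wide, item):
    """Set bit i when wide-table row i should appear in the join result."""
    bitmap = 0
    for rid, _, _, row_item in wide:
        if row_item == item:
            bitmap |= 1 << rid
    return bitmap

def ground_truth(wide, bitmap):
    """Expected join result: the wide-table rows selected by the bitmap."""
    return sorted(
        (c, city, item)
        for rid, c, city, item in wide
        if (bitmap >> rid) & 1
    )

def run_join(customers, orders, item):
    """Run the equivalent join on the normalized tables (SQLite as stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (name TEXT PRIMARY KEY, city TEXT)")
    conn.execute("CREATE TABLE orders (row_id INTEGER, name TEXT, item TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", customers)
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
    rows = conn.execute(
        "SELECT c.name, c.city, o.item "
        "FROM orders o JOIN customers c ON o.name = c.name "
        "WHERE o.item = ?",
        (item,),
    ).fetchall()
    conn.close()
    return sorted(rows)

if __name__ == "__main__":
    customers, orders = normalize(WIDE)
    expected = ground_truth(WIDE, bitmap_for_item(WIDE, "pen"))
    actual = run_join(customers, orders, "pen")
    # A mismatch here would indicate a candidate logic bug in the join.
    print("match" if actual == expected else "MISMATCH", actual)
```

In this sketch, a disagreement between `actual` and `expected` plays the role of a detected logic bug: the wide table serves as the oracle, and the normalized tables are what the DBMS actually joins.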