🤖 AI Summary
Problem: Generating test inputs that satisfy both syntactic and semantic constraints is a bottleneck for compiler testing. SMT solvers are very slow at finding string solutions, and current evolutionary implementations, although faster and more general, still struggle with the complex constraints that compiler testing requires.
Method: This paper proposes FANDANGO-RS, a constrained-input generation prototype for evolutionary language-based testing, which pairs context-free grammars with semantic constraints over grammar elements. It rests on two ideas: (1) grammar definitions are transformed into Rust types and trait implementations, so that the compiler can near-maximally optimize arbitrary operations on arbitrary grammars; and (2) improved evolutionary algorithms strengthen the search for inputs that satisfy complex constraint systems.
Contribution/Results: In a case study on a C subset, FANDANGO-RS generates 401 diverse, complex, and valid test inputs for a C compiler per minute. The authors report a performance improvement of 3-4 orders of magnitude over the current state of the art, reducing hours of generation and constraint-solving time to seconds, and the prototype solves constraints that previous strategies cannot handle.
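To make the idea of turning a grammar into Rust types and trait implementations more concrete, here is a minimal hand-written sketch. It is not FANDANGO-RS's generated code: the toy grammar, the `Unparse` trait, and the names `Digit`, `Expr`, `render`, and `sample` are all hypothetical and chosen only for illustration. It shows one type per nonterminal, one enum variant per alternative, and a generic function that the Rust compiler can monomorphize for each grammar symbol.

```rust
// Hypothetical toy grammar (an assumption, not taken from the paper):
//   <expr>  ::= <digit> "+" <expr> | <digit>
//   <digit> ::= "0" | "1" | ... | "9"

/// Hypothetical trait: every grammar symbol can write itself to a string.
trait Unparse {
    fn unparse(&self, out: &mut String);
}

/// <digit>: a single terminal choice, stored as its numeric value.
struct Digit(u8);

/// <expr>: one enum variant per grammar alternative.
enum Expr {
    Add(Digit, Box<Expr>),
    Lit(Digit),
}

impl Unparse for Digit {
    fn unparse(&self, out: &mut String) {
        // `% 10` keeps the output inside the terminal set "0".."9".
        out.push((b'0' + self.0 % 10) as char);
    }
}

impl Unparse for Expr {
    fn unparse(&self, out: &mut String) {
        match self {
            Expr::Add(digit, rest) => {
                digit.unparse(out);
                out.push('+');
                rest.unparse(out);
            }
            Expr::Lit(digit) => digit.unparse(out),
        }
    }
}

/// Generic over the symbol type, so the compiler emits a specialized,
/// statically dispatched version for each nonterminal it is used with.
fn render<T: Unparse>(node: &T) -> String {
    let mut out = String::new();
    node.unparse(&mut out);
    out
}

/// A hand-built derivation tree for the input "1+2+3".
fn sample() -> Expr {
    Expr::Add(
        Digit(1),
        Box::new(Expr::Add(Digit(2), Box::new(Expr::Lit(Digit(3))))),
    )
}
```

Because every derivation tree is a value of type `Expr`, a syntactically invalid tree cannot be constructed in this sketch, and calls such as `render(&sample())` are resolved at compile time rather than through a generic tree interpreter.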
📝 Abstract
Language-based testing combines context-free grammar definitions with semantic constraints over grammar elements to generate test inputs. By pairing context-free grammars with constraints, users have the expressiveness of unrestricted grammars while retaining simple structure. However, producing inputs in the presence of such constraints can be challenging. In past approaches, SMT solvers have been found to be very slow at finding string solutions; evolutionary algorithms are faster and more general, but current implementations still struggle with complex constraints that would be required for domains such as compiler testing. In this paper, we present a novel approach for evolutionary language-based testing that improves performance by 3-4 orders of magnitude over the current state of the art, reducing hours of generation and constraint solving time to seconds. We accomplish this by (1) carefully transforming grammar definitions into Rust types and trait implementations, ensuring that the compiler may near-maximally optimize arbitrary operations on arbitrary grammars; and (2) using better evolutionary algorithms that improve the ability of language-based testing to solve complex constraint systems. These performance and algorithmic improvements allow our prototype, FANDANGO-RS, to solve constraints that previous strategies simply cannot handle. We demonstrate this by a case study for a C subset, in which FANDANGO-RS is able to generate 401 diverse, complex, and valid test inputs for a C compiler per minute.
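The interplay between a grammar and a semantic constraint in an evolutionary search can be sketched as follows. This is a generic, simplified elitist loop and not the algorithm used by FANDANGO-RS. The toy grammar (`<expr> ::= <digit> "+" <expr> | <digit>`), the constraint (the digits must sum to a target), and every name here (`Rng`, `fitness`, `mutate`, `solve`, `render`) are assumptions made for illustration. Individuals are valid derivations by construction, and the fitness function measures how far an individual is from satisfying the constraint.

```rust
/// Tiny xorshift PRNG so the sketch needs no external crates.
struct Rng(u64);

impl Rng {
    fn next(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }

    fn below(&mut self, n: u64) -> u64 {
        self.next() % n
    }
}

/// Distance to the hypothetical constraint "digits sum to `target`".
/// A fitness of 0 means the constraint is satisfied.
fn fitness(ind: &[u8], target: u32) -> u32 {
    let sum: u32 = ind.iter().map(|&d| d as u32).sum();
    sum.abs_diff(target)
}

/// Mutations that preserve syntactic validity: the digit list stays
/// non-empty and every element stays in 0..=9.
fn mutate(ind: &mut Vec<u8>, rng: &mut Rng) {
    match rng.below(3) {
        0 => {
            let i = rng.below(ind.len() as u64) as usize;
            ind[i] = rng.below(10) as u8;
        }
        1 => ind.push(rng.below(10) as u8),
        _ => {
            if ind.len() > 1 {
                ind.pop();
            }
        }
    }
}

/// Elitist loop: keep the better half, refill with mutated copies.
/// Returns `None` if no solution is found within `max_gens` generations.
fn solve(target: u32, seed: u64, max_gens: usize) -> Option<Vec<u8>> {
    let mut rng = Rng(seed | 1); // xorshift state must be non-zero
    let mut pop: Vec<Vec<u8>> = (0..16).map(|_| vec![rng.below(10) as u8]).collect();
    for _ in 0..max_gens {
        pop.sort_by_key(|ind| fitness(ind, target));
        if fitness(&pop[0], target) == 0 {
            return Some(pop[0].clone());
        }
        for i in 8..16 {
            let mut child = pop[i - 8].clone();
            mutate(&mut child, &mut rng);
            pop[i] = child;
        }
    }
    None
}

/// Unparse an individual back into the string the grammar describes.
fn render(ind: &[u8]) -> String {
    ind.iter().map(|d| d.to_string()).collect::<Vec<_>>().join("+")
}
```

A call such as `solve(20, 42, 10_000).map(|v| render(&v))` would yield a string like `9+9+2` when the search succeeds; the exact result depends on the seed. Real constraint systems for a C subset are far richer than a single sum, which is where the quality of the evolutionary algorithm matters.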