🤖 AI Summary
This work addresses the lack of standardized evaluation criteria for uniform random samplers of variability systems encoded as Boolean formulas. We propose the first uniformity testing framework tailored to constrained solution spaces. Methodologically, we design a suite of five statistical tests specifically adapted to Boolean constraint structures, integrating hypothesis testing, SAT solving, model counting, and CSP analysis. We conduct systematic evaluations across seven state-of-the-art samplers, demonstrating the framework’s effectiveness and robustness. Furthermore, we uncover, for the first time, empirical regularities governing how formula-level structural properties—particularly variable dependencies and constraint density—significantly impact sampling uniformity. Our framework establishes a reproducible, interpretable, and principled evaluation benchmark for constraint-guided random sampling, thereby filling a critical methodological gap in the field.
📝 Abstract
Boolean formulae compactly encode huge, constrained search spaces. Thus, variability-intensive systems are often encoded with Boolean formulae. The search space of a variability-intensive system is usually too large to explore without statistical inference (e.g. testing). Testing every valid configuration is computationally expensive (if not impossible) for most systems. This leads most testing approaches to sample a few configurations before analyzing them. A desirable property of such samples is uniformity: Each solution should have the same selection probability. Uniformity is the property that facilitates statistical inference. This property motivated the design of uniform random samplers, relying on SAT solvers and counters and achieving different trade-offs between uniformity and scalability. Though we can observe their performance in practice, judging the quality of the generated samples is different. Assessing the uniformity of a sampler is similar in nature to assessing the uniformity of a pseudo-random number (PRNG) generator. However, sampling is much slower and the nature of sampling also implies that the hyperspace containing the samples is constrained. This means that testing PRNGs is subject to fewer constraints than testing samplers. We propose a framework that contains five statistical tests which are suited to test uniform random samplers. Moreover, we demonstrate their use by testing seven samplers. Finally, we demonstrate the influence of the Boolean formula given as input to the samplers under test on the test results.