🤖 AI Summary
This work addresses the scalability limits of existing Mixed-Boolean Arithmetic (MBA) expression synthesizers, which are predominantly CPU-based and struggle with large specifications or complex targets. Prior GPU-accelerated synthesis techniques achieve large speedups but depend on caching observationally equivalent candidates, a strategy that breaks down for MBA because candidate outputs are bitvectors and the behavioral space is enormous. To overcome this, we propose SIMBA, a GPU-accelerated MBA synthesizer built around cache-free bottom-up enumeration tailored to GPU architectures. By keeping work local while exposing massive parallelism, SIMBA departs from the conventional paradigm of caching behaviorally equivalent candidates. In the reported experiments, it is substantially faster than prior MBA synthesis tools, handles larger specifications, and reaches expression sizes that existing methods fail to solve.
📝 Abstract
Synthesizing Mixed-Boolean Arithmetic (MBA) expressions from input-output examples is central to program deobfuscation and also useful for compiler optimization, reverse engineering, and cryptanalysis. Existing MBA synthesizers are typically CPU-based and scale poorly on large specifications or complex targets. Recent GPU-accelerated synthesis methods achieve large speedups in qualitative settings, but they depend on caching observationally equivalent candidates; this strategy breaks down for MBA because candidate outputs are quantitative bitvectors and the behavioral space is enormous. We present SIMBA (Synthesis of Mixed-Boolean Arithmetic), a GPU-accelerated MBA synthesizer built around cache-free bottom-up enumeration. SIMBA avoids language caches entirely and uses a GPU-oriented enumeration design that keeps work local and highly parallel. In experiments, SIMBA is substantially faster than prior MBA synthesis tools, handles larger specifications, and reaches expression sizes that existing methods fail to solve. These results establish cache-free GPU synthesis as a practical and scalable approach for quantitative domains, and identify it as a strong alternative to cache-centric designs.
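To make the idea of cache-free bottom-up enumeration concrete, the following is a minimal CPU-only sketch in Python. It is not SIMBA's implementation: the abstract does not describe the GPU kernel design, so the names `MASK`, `BINOPS`, `enumerate_exprs`, and `synthesize`, the 8-bit width, and the operator set are all assumptions chosen for illustration. The sketch enumerates expressions by increasing size and checks each one against the input-output examples, without ever storing a table of observationally equivalent candidates; sub-expressions are regenerated on demand instead.

```python
MASK = 0xFF  # assumed 8-bit bitvector width, chosen only to keep the demo small

# Assumed operator set mixing arithmetic and Boolean operators (hypothetical).
BINOPS = {
    "+": lambda a, b: (a + b) & MASK,
    "-": lambda a, b: (a - b) & MASK,
    "&": lambda a, b: a & b,
    "|": lambda a, b: a | b,
    "^": lambda a, b: a ^ b,
}


def enumerate_exprs(size, leaves):
    """Yield (text, outputs) for every expression with exactly `size` nodes.

    `outputs` holds the expression's value on each example. Nothing is
    memoized: smaller expressions are re-enumerated whenever they are needed,
    so no cache of previously seen behaviors is kept.
    """
    if size == 1:
        yield from leaves
        return
    # Unary bitwise NOT applied to every expression of size - 1.
    for text, outs in enumerate_exprs(size - 1, leaves):
        yield f"~{text}", tuple(~o & MASK for o in outs)
    # Binary operators: split the remaining size - 1 nodes between both sides.
    for left_size in range(1, size - 1):
        right_size = size - 1 - left_size
        for ltext, louts in enumerate_exprs(left_size, leaves):
            for rtext, routs in enumerate_exprs(right_size, leaves):
                for sym, fn in BINOPS.items():
                    outs = tuple(fn(a, b) for a, b in zip(louts, routs))
                    yield f"({ltext} {sym} {rtext})", outs


def synthesize(examples, var_names, max_size=5):
    """Return the first expression (smallest size first) matching all examples."""
    leaves = [
        (name, tuple(inputs[i] & MASK for inputs, _ in examples))
        for i, name in enumerate(var_names)
    ]
    target = tuple(out & MASK for _, out in examples)
    for size in range(1, max_size + 1):
        for text, outs in enumerate_exprs(size, leaves):
            if outs == target:
                return text
    return None


# Usage: examples generated from the obfuscated form (x | y) - (x & y).
inputs = [(3, 5), (10, 20), (255, 1), (7, 7)]
examples = [((x, y), ((x | y) - (x & y)) & MASK) for x, y in inputs]
print(synthesize(examples, ["x", "y"]))
```

The trade-off this sketch illustrates is recomputation in exchange for memory: because no behavior table is stored, memory use stays small and each candidate can be evaluated independently, which is the property that makes this style of enumeration a plausible fit for massively parallel hardware. How SIMBA actually partitions and schedules this work on a GPU is not stated in the abstract.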