🤖 AI Summary
Existing compiler testing tools such as Csmith struggle to adapt to MLIR's extensible architecture, and no systematic approach exists for testing MLIR and its downstream compilers, including LLVM, DaCe, and DCIR. This work presents the first randomized program generation technique tailored to MLIR, combined with differential testing to validate consistency across multiple compiler pipelines. By leveraging MLIR's modular design, the method supports custom dialects and transformations, addressing a critical gap in automated testing within the MLIR ecosystem. Experimental evaluation demonstrates the approach's effectiveness and practicality: it uncovered several optimization-related bugs across diverse compiler backends, improving overall compiler reliability.
📝 Abstract
Compilers are essential to the performance and correct execution of software and are relevant across scientific disciplines. Despite this, tools for testing and evaluating them are scarce, especially in the context of the extensible Multi-Level Intermediate Representation (MLIR). This paper addresses the need for a tool that can accommodate MLIR's extensibility, which previous methods such as Csmith do not provide. We introduce MLIR-Smith, a novel random program generator designed to test and evaluate MLIR-based compiler optimizations. We demonstrate the utility of MLIR-Smith by conducting differential testing on MLIR, LLVM, DaCe, and DCIR, which uncovered multiple bugs in these compiler pipelines. MLIR-Smith fills a gap in compiler testing and underscores the importance of comprehensive testing within these systems. By providing a tool that generates random MLIR programs, this paper improves our ability to evaluate and strengthen compilers, and paves the way for future tools in software testing and quality assurance.
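The differential-testing workflow described above can be sketched in miniature. The generator and the two "pipelines" below are hypothetical stand-ins, not MLIR-Smith's actual implementation: a real harness would emit random MLIR modules and run the same module through distinct lowerings (e.g. through LLVM versus DaCe), flagging any program whose outputs diverge.

```python
import random

def generate_program(seed):
    """Stand-in for random program generation: a tiny random
    sequence of arithmetic ops (op, operand) pairs."""
    rng = random.Random(seed)
    return [(rng.choice(['+', '*']), rng.randint(0, 9)) for _ in range(5)]

def run_pipeline_a(program):
    # Reference "pipeline": fold the ops left to right.
    acc = 1
    for op, val in program:
        acc = acc + val if op == '+' else acc * val
    return acc

def run_pipeline_b(program):
    # Second "pipeline" with the same intended semantics; in a real
    # harness this would invoke a different compiler back end.
    acc = 1
    for op, val in program:
        acc = (acc + val) if op == '+' else (acc * val)
    return acc

def differential_test(num_cases=100):
    """Return the seeds whose outputs diverge between pipelines;
    each mismatch is a candidate compiler bug to minimize and report."""
    mismatches = []
    for seed in range(num_cases):
        prog = generate_program(seed)
        if run_pipeline_a(prog) != run_pipeline_b(prog):
            mismatches.append(seed)
    return mismatches

print(differential_test())  # [] when the two pipelines agree
```

The key design point is that no oracle for the "correct" output is needed: agreement between independent pipelines serves as the correctness signal, which is what makes the approach practical for randomly generated programs.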