🤖 AI Summary
Existing WebAssembly (Wasm) compilers face a trade-off: domain-specific approaches lack reusability, while general-purpose ones—especially those relying on LLVM IR—lose high-level semantics, hindering efficient support for advanced features such as garbage collection (GC) and stack switching.
Method: This paper introduces the first MLIR-native compilation pipeline tailored for Wasm, featuring a dedicated Wasm dialect family (WasmDialect) that preserves high-level semantics end-to-end across multiple IR levels. It proposes a novel, pattern-based modular extension mechanism in MLIR and is the first to fully support stack-switching compilation within MLIR. The pipeline integrates the WABT backend to bypass LLVM’s lowering overhead.
Results: Evaluated on PolyBench, the pipeline performs comparably to the LLVM backend (at most 7.7% slower, and faster in some execution environments), generates smaller code, and enables GC and stack switching with zero additional engineering effort.
📝 Abstract
WebAssembly (Wasm) is a portable bytecode format that serves as a compilation target for high-level languages, enabling their secure and efficient execution across diverse platforms, including web browsers and embedded systems. To improve support for high-level languages without incurring significant code size or performance overheads, Wasm continuously evolves by integrating high-level features such as Garbage Collection and Stack Switching. However, existing compilation approaches either lack a reusable design -- requiring redundant implementation effort for each language -- or lose abstraction by lowering high-level constructs into low-level shared representations like LLVM IR, which hinders the adoption of high-level features. The MLIR compiler infrastructure provides a compilation pipeline with multiple levels of abstraction, preserving high-level abstractions throughout compilation, yet the current MLIR pipeline relies on the LLVM backend for Wasm code generation, thereby inheriting LLVM's limitations. This paper presents a novel compilation pipeline for Wasm, featuring Wasm dialects explicitly designed to represent high-level Wasm constructs within MLIR. Our approach enables direct generation of high-level Wasm code from corresponding high-level MLIR dialects without losing abstraction, providing a modular and extensible way to incorporate high-level Wasm features. We illustrate this extensibility through a case study that leverages Stack Switching, a recently introduced high-level feature of Wasm. Performance evaluations on PolyBench benchmarks show that our pipeline, benefiting from optimizations within the MLIR and Wasm ecosystems, produces code that is at most 7.7% slower than, and in some execution environments faster than, code produced by LLVM-based compilers.