Equivalence Checking of ML GPU Kernels

📅 2025-11-16
🤖 AI Summary
Problem: Optimized GPU kernels for ML workloads (convolution, GEMM, attention) are increasingly produced by manual tuning, compiler autotuning, or LLM-based synthesis, but existing approaches give no formal guarantee that an optimized kernel is functionally equivalent to the original. Method: This paper introduces VOLTA, the first equivalence checker for GPU kernels. VOLTA builds semantic models of GPU kernels and applies an equivalence-checking algorithm that is sound and, for a well-defined class of kernels that includes the ML kernels of interest, complete. Contribution/Results: VOLTA handles common deep learning and LLM computation patterns, including convolutions, matrix multiplications, and various attention mechanisms, and automatically verifies kernels optimized by hand, by LLMs, and by compilers. Experimental evaluation shows that VOLTA detects multiple previously unknown semantic inconsistencies in state-of-the-art optimized kernels. The result is a formal basis for trusting low-level operator optimizations in AI systems.

📝 Abstract
With the rapid progress of deep learning and large language models (LLMs), companies now spend enormous sums executing GPU kernels. These kernels have, therefore, become prime targets for aggressive optimization. Recent efforts increasingly leverage LLMs to generate GPU kernels, but make no formal guarantees about the generated kernels. We present the first equivalence checker for GPU kernels and use it to formally verify the correctness of machine learning (ML) kernels optimized by hand, by LLMs, and by compilers. We show that our equivalence checker is sound and, for a well-defined class of GPU kernels which includes the programs of interest, complete. Our implementation, VOLTA, can verify ML computations such as convolutions, matrix multiplications, and various attention mechanisms.

Problem

Research questions and friction points this paper is trying to address.

Verifying equivalence of optimized ML GPU kernels
Ensuring correctness across manual and automated optimizations
Providing formal guarantees for kernel transformations

Innovation

Methods, ideas, or system contributions that make the work stand out.

First equivalence checker for GPU kernels
Formally verifies ML kernels optimized by hand, by LLMs, and by compilers
Sound in general, and complete for a well-defined class of GPU kernels that includes the ML kernels of interest
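
To make the notion of kernel equivalence concrete, here is a minimal Python sketch. It is not VOLTA's algorithm, and every name in it (`naive_matmul`, `tiled_matmul`, `symbolic_matrix`, `handle_remainder`) is hypothetical. It runs a reference matrix multiplication and a tiled variant on symbolic inputs and compares the resulting sums of products. Two simplifying assumptions apply: addition is treated as associative and commutative (true for real numbers, not for floating point), and the check covers one fixed matrix size rather than all sizes.

```python
from collections import Counter


def symbolic_matrix(name, n):
    """Build an n x n matrix whose entries are symbol names, e.g. 'a0_1'."""
    return [[f"{name}{i}_{j}" for j in range(n)] for i in range(n)]


def naive_matmul(A, B, n):
    """Reference kernel. Each output is a multiset of product terms."""
    C = [[Counter() for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j][(A[i][k], B[k][j])] += 1
    return C


def tiled_matmul(A, B, n, tile, handle_remainder=True):
    """Tiled variant, loosely mimicking a blocked GPU kernel.

    With handle_remainder=False the last partial tile along k is dropped,
    a typical bug when n is not a multiple of the tile size.
    """
    k_end = n if handle_remainder else n - n % tile
    C = [[Counter() for _ in range(n)] for _ in range(n)]
    for i0 in range(0, n, tile):
        for j0 in range(0, n, tile):
            for k0 in range(0, k_end, tile):
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, n)):
                        for k in range(k0, min(k0 + tile, k_end)):
                            C[i][j][(A[i][k], B[k][j])] += 1
    return C


n, tile = 3, 2
A, B = symbolic_matrix("a", n), symbolic_matrix("b", n)
reference = naive_matmul(A, B, n)

# Same multiset of product terms for every output element: equivalent.
correct_is_equivalent = tiled_matmul(A, B, n, tile) == reference
# The variant that drops the partial tile misses the k = 2 terms.
buggy_is_equivalent = tiled_matmul(A, B, n, tile, handle_remainder=False) == reference
```

Because the inputs are symbols rather than numbers, a mismatch cannot be hidden by a lucky choice of test data, which is the basic advantage of checking equivalence over testing on random inputs. A real checker must additionally reason about all input sizes, thread and block indexing, and memory accesses, none of which this sketch attempts.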