🤖 AI Summary
Large language models (LLMs) can generate scientific computing code, such as Fast Fourier Transform (FFT) implementations, but that code comes with no formal guarantees of numerical stability, floating-point precision, or algorithmic correctness.
Method: This paper proposes a stepwise semantic lifting approach grounded in an extended SPIRAL framework, integrating symbolic execution with interactive theorem proving (Coq/Isabelle) to systematically lift LLM-generated floating-point code to high-level, mathematically precise specifications. The method explicitly encodes FFT domain knowledge and constraints so that floating-point arithmetic semantics are modeled faithfully.
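To make the lifting idea concrete, here is a minimal illustrative sketch (not the paper's actual SPIRAL/Coq pipeline): a size-2 FFT "butterfly", as an LLM might emit it, is executed on symbolic inputs with SymPy, and the resulting symbolic outputs are checked term by term against the mathematical DFT specification. The function name `butterfly` is hypothetical.

```python
# Sketch of semantic lifting via symbolic execution (assumed, simplified setup).
import sympy as sp

def butterfly(a, b):
    # Radix-2 butterfly in the straight-line style of generated kernel code.
    return a + b, a - b

x0, x1 = sp.symbols('x0 x1')
y0, y1 = butterfly(x0, x1)  # symbolic execution: run the code on symbols

# High-level specification for N = 2: X_k = sum_n x_n * exp(-2*pi*i*k*n/N)
N = 2
xs = [x0, x1]
spec = [sp.simplify(sum(xs[n] * sp.exp(-2 * sp.pi * sp.I * k * n / N)
                        for n in range(N))) for k in range(N)]

# The lifted symbolic outputs match the specification exactly.
assert sp.simplify(y0 - spec[0]) == 0
assert sp.simplify(y1 - spec[1]) == 0
```

In the paper's setting this equivalence check is discharged by a theorem prover rather than a computer algebra system, and the lifting proceeds stepwise through intermediate representations instead of in one shot.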
Contribution/Results: It introduces the first semantic lifting pathway tailored to scientific computing kernels. Experiments demonstrate end-to-end lifting of GPT-generated FFT code into verifiable mathematical specifications, with formal verification of both algorithmic equivalence and numerical stability. This establishes a novel paradigm for formally certifying AI-generated scientific software.
📝 Abstract
The rise of automated code generation tools, such as large language models (LLMs), has introduced new challenges in ensuring the correctness and efficiency of scientific software, particularly in complex kernels where numerical stability, domain-specific optimizations, and precise floating-point arithmetic are critical. We propose a stepwise semantic lifting approach using an extended SPIRAL framework with symbolic execution and theorem proving to statically derive high-level code semantics from LLM-generated kernels. This method establishes a structured path for verifying the source code's correctness via a step-by-step lifting procedure to a high-level specification. We conducted preliminary tests of the approach's feasibility by successfully lifting GPT-generated fast Fourier transform code to high-level specifications.
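The kind of equivalence the lifting targets can be illustrated as follows. This is a hedged sketch, not the paper's verified pipeline or its actual LLM-generated kernel: a recursive radix-2 FFT in the style GPT typically produces is compared numerically against the naive DFT, which plays the role of the high-level mathematical specification.

```python
# Illustrative check of an FFT kernel against its DFT specification.
import cmath

def fft(x):
    # Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two.
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw          # butterfly: top half
        out[k + n // 2] = even[k] - tw  # butterfly: bottom half
    return out

def dft(x):
    # Direct O(n^2) DFT: the specification X_k = sum_n x_n * exp(-2*pi*i*k*n/n).
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

x = [complex(i, -i) for i in range(8)]
err = max(abs(a - b) for a, b in zip(fft(x), dft(x)))
assert err < 1e-9  # floating-point agreement within rounding error
```

Numerical testing like this only samples inputs; the paper's contribution is to replace it with a formal, machine-checked proof of equivalence and a stability bound that holds for all inputs.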