SPADA: A Spatial Dataflow Architecture Programming Language

📅 2025-11-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Programming spatial dataflow architectures such as the Cerebras Wafer-Scale Engine (WSE) requires explicitly coordinating data movement and asynchronous computation triggered by data arrival, and existing FPGA/CGRA abstractions do not fit the regular grid-based dataflows and intricate on-chip routing of these architectures. To address this, we propose SPADA, a programming language for spatial dataflow architectures. Our approach introduces: (1) a formal dataflow semantics framework that defines routing correctness, data races, and deadlocks; (2) precise control over data placement, dataflow patterns, and asynchronous operations, with architecture-specific low-level details abstracted away; and (3) a compiler targeting Cerebras CSL with multi-level lowering, which also lets SPADA serve as an intermediate representation for high-level DSLs such as GT4Py. In the reported evaluation, SPADA achieves near-ideal weak scaling across three orders of magnitude and expresses pipelined reductions and multi-dimensional stencils in 6–8× less code than CSL.

📝 Abstract

Spatial dataflow architectures like the Cerebras Wafer-Scale Engine achieve exceptional performance in AI and scientific applications by leveraging distributed memory across processing elements (PEs) and localized computation. However, programming these architectures remains challenging due to the need for explicit orchestration of data movement through reconfigurable networks-on-chip and asynchronous computation triggered by data arrival. Existing FPGA and CGRA programming models emphasize loop scheduling but overlook the unique capabilities of spatial dataflow architectures, particularly efficient dataflow over regular grids and intricate routing management. We present SPADA, a programming language that provides precise control over data placement, dataflow patterns, and asynchronous operations while abstracting architecture-specific low-level details. We introduce a rigorous dataflow semantics framework for SPADA that defines routing correctness, data races, and deadlocks. Additionally, we design and implement a compiler targeting Cerebras CSL with multi-level lowering. SPADA serves as both a high-level programming interface and an intermediate representation for domain-specific languages (DSLs), which we demonstrate with the GT4Py stencil DSL. SPADA enables developers to express complex parallel patterns, including pipelined reductions and multi-dimensional stencils, in 6–8× less code than CSL with near-ideal weak scaling across three orders of magnitude. By unifying programming for spatial dataflow architectures under a single model, SPADA advances both the theoretical foundations and practical usability of these emerging high-performance computing platforms.
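
To make the "pipelined reduction" pattern mentioned in the abstract concrete, here is a minimal Python simulation of an element-wise sum along a 1D chain of PEs. It is only an illustrative sketch: it is not SPADA or CSL code, and the function name `pipelined_chain_reduce`, the one-element-per-step timing model, and the left-to-right chain layout are all assumptions made for this example rather than details from the paper.

```python
from typing import List, Optional, Tuple


def pipelined_chain_reduce(pe_data: List[List[float]]) -> Tuple[List[float], int]:
    """Simulate an element-wise sum along a 1D chain of PEs (hypothetical model).

    pe_data[p] is the local vector stored on PE p. Partial sums stream from
    PE 0 toward the last PE. At time step t, PE p handles element k = t - p,
    so different PEs work on different elements at the same time.
    Returns the reduced vector and the number of time steps used.
    """
    num_pes = len(pe_data)
    length = len(pe_data[0])

    # incoming[p][k] is the partial sum for element k arriving at PE p.
    incoming: List[List[Optional[float]]] = [
        [None] * length for _ in range(num_pes + 1)
    ]
    incoming[0] = [0.0] * length  # the first PE starts from zero

    steps = num_pes + length - 1  # pipeline fill plus drain
    for t in range(steps):
        for p in range(num_pes):
            k = t - p
            if 0 <= k < length:
                arrived = incoming[p][k]
                # A PE only computes once its input has arrived.
                assert arrived is not None
                incoming[p + 1][k] = arrived + pe_data[p][k]

    result = [v for v in incoming[num_pes] if v is not None]
    return result, steps


if __name__ == "__main__":
    data = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0], [100.0, 200.0, 300.0]]
    print(pipelined_chain_reduce(data))
```

Under this simplified timing model, the pipelined version finishes in `num_pes + length - 1` steps, whereas forwarding each full vector before the next PE starts would take on the order of `num_pes * length` steps.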

Problem

Research questions and friction points this paper is trying to address.

Programming spatial dataflow architectures requires explicit data movement orchestration

Existing FPGA and CGRA programming models emphasize loop scheduling but overlook efficient dataflow over regular grids and routing management

Complex parallel patterns need simplified expression while maintaining performance scaling
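
For readers unfamiliar with the scaling metric, weak scaling keeps the work per PE fixed while the number of PEs grows, so ideal behavior is a constant runtime. The sketch below computes the usual efficiency ratio under that definition; the function name and the example timings are hypothetical placeholders, not measurements from the paper.

```python
def weak_scaling_efficiency(base_time: float, scaled_time: float) -> float:
    """Return weak-scaling efficiency: runtime on the baseline PE count
    divided by runtime on the larger PE count, with per-PE work held fixed.
    A value of 1.0 means ideal weak scaling."""
    if base_time <= 0 or scaled_time <= 0:
        raise ValueError("runtimes must be positive")
    return base_time / scaled_time


if __name__ == "__main__":
    # Placeholder timings in seconds, chosen only to illustrate the formula.
    print(weak_scaling_efficiency(base_time=2.0, scaled_time=2.5))
```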

Innovation

Methods, ideas, or system contributions that make the work stand out.

Programming language for spatial dataflow architectures

Abstracts low-level details with precise dataflow control

Compiler targeting Cerebras CSL with multi-level lowering
