Adaptation of AI-accelerated CFD Simulations to the IPU platform

📅 2026-05-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Traditional computational fluid dynamics (CFD) simulations are computationally expensive, which motivates AI-based acceleration. This work evaluates how well the Intelligence Processing Unit (IPU) suits such AI-for-simulation workloads by porting a TensorFlow-based training program for a CFD machine learning model to an IPU-POD16 platform, using the custom TensorFlow provided by the Poplar SDK. The model is trained on data from OpenFOAM simulations. Using the popdist library to relieve a host-side bottleneck in feeding training data to the IPUs yields up to a 34% speedup. Because of communication overheads, data parallelism across two IPUs does not improve throughput over a single IPU. Scaling from 2 to 16 IPUs, however, raises throughput from 560.8 to 2805.8 samples per second, which the authors attribute to the hardware's inter-IPU communication capabilities.

📝 Abstract

Intelligence Processing Units (IPUs) have proven useful for many AI applications. In this paper, we evaluate them within the emerging field of AI for simulation, where traditional numerical simulations are supported by artificial intelligence approaches. We focus specifically on a program for training machine learning models supporting a computational fluid dynamics application. We use custom TensorFlow provided by the Poplar SDK to adapt the program for the IPU-POD16 platform and investigate its ease of use and performance scalability. Training a model on data from OpenFOAM simulations allows us to get accurate simulation state predictions at test time. We show how to utilize the popdist library to overcome a performance bottleneck in feeding training data to the IPU on the host side, achieving up to 34% speedup. Due to communication overheads, using data parallelism to utilize two IPUs instead of one does not improve the throughput. However, once the intra-IPU costs have been paid, the hardware capabilities for inter-IPU communication allow for good scalability. Increasing the number of IPUs from 2 to 16 improves the throughput from 560.8 to 2805.8 samples/s.
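
As a rough reading of the throughput figures quoted in the abstract, the short Python sketch below computes the speedup and the parallel efficiency of the 16-IPU run relative to the 2-IPU run. The function name `scaling_summary` and its dictionary keys are illustrative choices, not something taken from the paper. Only the four input numbers come from the abstract. The 2-IPU baseline already includes the communication overhead that the abstract mentions, so the efficiency here is relative to that baseline, not to a single IPU.

```python
def scaling_summary(n_base, thr_base, n_scaled, thr_scaled):
    """Compare two data-parallel runs by throughput (samples/s).

    Hypothetical helper for illustration; not part of the paper's code.
    """
    speedup = thr_scaled / thr_base   # measured throughput ratio
    ideal = n_scaled / n_base         # ratio expected under perfect scaling
    return {
        "speedup": speedup,
        "ideal_speedup": ideal,
        "efficiency": speedup / ideal,           # fraction of ideal achieved
        "per_ipu_base": thr_base / n_base,       # samples/s per IPU, baseline
        "per_ipu_scaled": thr_scaled / n_scaled, # samples/s per IPU, scaled
    }


# Figures from the abstract: 2 IPUs -> 560.8 samples/s, 16 IPUs -> 2805.8 samples/s.
summary = scaling_summary(2, 560.8, 16, 2805.8)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

With these inputs the measured speedup is roughly 5x against an ideal of 8x, so about 63% of the ideal, with per-IPU throughput falling from about 280 to about 175 samples/s.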

Problem

Research questions and friction points this paper is trying to address.

AI-accelerated CFD

IPU platform

performance scalability

data feeding bottleneck

computational fluid dynamics

Innovation

Methods, ideas, or system contributions that make the work stand out.

IPU

AI-accelerated CFD

popdist