Radiant Foam Rendering on a Graph Processor

📅 2026-01-07

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work adapts volume rendering, which typically relies on a single large unified memory, to many-core accelerators whose memory is distributed across small per-core SRAMs. It demonstrates Radiant Foam volume rendering running entirely from on-chip SRAM on the Graphcore Mk2 IPU, using a scene tiling (sharding) strategy coupled with a hierarchical ray routing mechanism. Explicit inter-tile communication and intra-tile scheduling optimizations keep the whole ray marching process in on-chip SRAM with predictable communication patterns. On Mip-NeRF 360 scenes the renderer reaches near-interactive frame rates of approximately 1 fps at 640×480 resolution, with image and depth quality comparable to the original GPU implementation, showing that this class of hardware can support irregular, data-movement-heavy rendering workloads.

📝 Abstract

Emerging many-core accelerators often replace a single large device memory with hundreds to thousands of lightweight cores, each owning only a small local SRAM and exchanging data via explicit on-chip communication. This organization offers high aggregate bandwidth, but it breaks a key assumption behind many volumetric rendering techniques: that rays can randomly access a large, unified scene representation. Rendering efficiently on such hardware therefore requires distributing both data and computation, keeping ray traversal mostly local, and structuring communication into predictable routes. We present a fully in-SRAM, distributed renderer for the Radiant Foam Voronoi-cell volumetric representation on the Graphcore Mk2 IPU (Intelligence Processing Unit), a many-core accelerator with tile-local SRAM and explicit inter-tile communication. Our system shards the scene across tiles and forwards rays between shards through a hierarchical routing overlay, enabling ray marching entirely from on-chip SRAM with predictable communication. On Mip-NeRF 360 scenes, the system attains near-interactive throughput of approximately 1 fps at 640×480 with image and depth map quality close to the original GPU-based Radiant Foam implementation, while keeping all scene data and ray state in on-chip SRAM. Beyond demonstrating feasibility, we analyze routing, memory, and scheduling bottlenecks that inform how future distributed-memory accelerators can better support irregular, data-movement-heavy rendering workloads.
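
To make the idea of forwarding rays through a hierarchical routing overlay concrete, the following is a minimal sketch of one possible two-level scheme. It is not the paper's implementation: the abstract does not specify the overlay's layout, so the cluster size, the choice of the first tile in each cluster as a relay, and the function names `cluster_of`, `router_of`, and `route` are all assumptions made for illustration. The only hardware figure used is the Mk2 IPU's tile count of 1472.

```python
NUM_TILES = 1472     # number of tiles on a Graphcore Mk2 IPU
CLUSTER_SIZE = 32    # ASSUMPTION: illustrative routing-group size, not from the paper


def cluster_of(tile: int) -> int:
    """Index of the (hypothetical) routing cluster that owns a tile."""
    return tile // CLUSTER_SIZE


def router_of(cluster: int) -> int:
    """ASSUMPTION: the first tile of each cluster relays inter-cluster rays."""
    return cluster * CLUSTER_SIZE


def route(src: int, dst: int) -> list[int]:
    """Return the sequence of tiles a ray visits when moving from src to dst.

    Rays staying in one cluster go directly to the destination tile; rays
    crossing clusters go via the source cluster's relay and then the
    destination cluster's relay, so every path has a small, fixed maximum
    number of hops.
    """
    if not (0 <= src < NUM_TILES and 0 <= dst < NUM_TILES):
        raise ValueError("tile id out of range")
    path = [src]
    if src == dst:
        return path  # the ray's next cell lives in the same shard
    if cluster_of(src) != cluster_of(dst):
        relays = (router_of(cluster_of(src)), router_of(cluster_of(dst)))
        for hop in relays:
            if hop != path[-1]:  # skip a relay the ray is already on
                path.append(hop)
    if dst != path[-1]:
        path.append(dst)
    return path
```

Under these assumptions, `route(5, 9)` is a single direct hop within one cluster, while `route(5, 70)` passes through two relay tiles. Bounding every route to at most three hops is one way to obtain the predictable communication pattern the abstract describes, at the cost of concentrating traffic on the relay tiles.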

Problem

Research questions and friction points this paper is trying to address.

volumetric rendering

distributed memory

many-core accelerator

in-SRAM computation

ray marching

Innovation

Methods, ideas, or system contributions that make the work stand out.

in-SRAM rendering

distributed ray tracing

Radiant Foam