FlexNeRFer: A Multi-Dataflow, Adaptive Sparsity-Aware Accelerator for On-Device NeRF Rendering

πŸ“… 2025-05-10
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
To address the high power consumption, large area overhead, and poor generalizability of existing accelerators for real-time neural radiance field (NeRF) rendering on edge devices, this work proposes an energy-efficient, multi-model-compatible hardware accelerator. The method introduces three key innovations: (1) a multiply-accumulate (MAC) array supporting concurrent data streams and configurable precision; (2) a flexible network-on-chip (NoC)-based interconnection architecture; and (3) a sparsity-aware storage scheme that adaptively selects the sparsity format based on the sparsity ratio and precision mode. Implemented in 28 nm CMOS technology, the accelerator achieves an 8.2–243.3× speedup and a 24.1–520.3× energy-efficiency improvement over an NVIDIA RTX 2080 Ti GPU, and a 4.2–86.9× speedup and a 2.3–47.5× energy-efficiency gain over NeuRex, across diverse NeRF models and rendering configurations.

πŸ“ Abstract
Neural Radiance Fields (NeRF), an AI-driven approach to 3D view reconstruction, has demonstrated impressive performance, sparking active research across fields. As a result, a range of advanced NeRF models has emerged, and on-device applications increasingly adopt NeRF for highly realistic scene reconstruction. Because NeRF-based applications leverage a variety of NeRF frameworks, there is a growing need for hardware that can efficiently support these diverse models. However, GPUs fail to meet the performance, power, and area (PPA) requirements of on-device applications, while existing accelerators are specialized for specific NeRF algorithms and lose efficiency when applied to other NeRF models. To address this limitation, we introduce FlexNeRFer, an energy-efficient, versatile NeRF accelerator. Its key components are: i) a flexible network-on-chip (NoC) supporting multiple dataflows and sparsity on a precision-scalable MAC array, and ii) efficient data storage using an optimal sparsity format chosen according to the sparsity ratio and precision mode. To evaluate the effectiveness of FlexNeRFer, we performed a layout implementation in 28 nm CMOS technology. Our evaluation shows that FlexNeRFer achieves an 8.2–243.3× speedup and a 24.1–520.3× improvement in energy efficiency over a GPU (NVIDIA RTX 2080 Ti), and a 4.2–86.9× speedup and a 2.3–47.5× improvement in energy efficiency over a state-of-the-art NeRF accelerator (NeuRex).
Problem

Research questions and friction points this paper is trying to address.

Efficient hardware support for diverse NeRF models
Overcoming GPU limitations in performance and energy efficiency
Optimizing data storage and processing for NeRF acceleration
Innovation

Methods, ideas, or system contributions that make the work stand out.

Flexible NoC supporting multi-dataflow and sparsity
Optimal sparsity format for efficient data storage
Precision-scalable MAC array for versatile NeRF acceleration
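The adaptive sparsity-format idea above can be illustrated with a small cost model: for a given tile, pick whichever encoding needs the fewest bits, given the sparsity ratio and the value precision. The candidate formats below (dense, bitmap, coordinate list) and the tile parameters are illustrative assumptions, not the paper's actual format set; this is a minimal sketch of the selection principle, not FlexNeRFer's implementation.

```python
import math

def best_sparsity_format(n, nnz, bits):
    """Pick the cheapest storage format for a length-n tile with nnz
    non-zeros at `bits` bits per value.  Formats are illustrative
    stand-ins for the paper's (unspecified here) format set."""
    idx_bits = max(1, math.ceil(math.log2(n)))   # bits per coordinate index
    costs = {
        "dense":      n * bits,                  # store every element
        "bitmap":     n + nnz * bits,            # 1-bit mask + packed values
        "coordinate": nnz * (idx_bits + bits),   # (index, value) pairs
    }
    return min(costs, key=costs.get), costs

# Lower precision shifts the break-even points: cheaper values make the
# per-element index overhead of the coordinate format relatively larger.
fmt, _ = best_sparsity_format(n=1024, nnz=10, bits=8)
print(fmt)  # highly sparse tile -> "coordinate"
```

The same comparison run at different sparsity ratios picks different winners (dense when nearly full, bitmap at moderate sparsity, coordinate when very sparse), which is the behavior the adaptive selection exploits.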
πŸ”Ž Similar Papers
2023-12-19International Conference on 3D VisionCitations: 2
2023-04-24International Symposium on Computer ArchitectureCitations: 11
2023-04-24International Symposium on Computer ArchitectureCitations: 47
Seock-Hwan Noh
DGIST, Daegu, Republic of Korea
Banseok Shin
Samsung Electronics, Suwon, Republic of Korea
Jeik Choi
DEEPX, Seongnam, Republic of Korea
Seungpyo Lee
Fitogether, Seoul, Republic of Korea
Jaeha Kung
Associate Professor, Korea University
Accelerator Design · Approximate Computing · ML Architecture · VLSI
Yeseong Kim
Associate and Distinguished Professor, DGIST
Brain-inspired HD Computing · Lightweight AI · System/Architecture Design for AI and IoT ecosystems