🤖 AI Summary
To address the high inference latency and power consumption of Neural Radiance Fields (NeRF) in real-time rendering, this paper proposes ASDR, an algorithm-architecture co-designed accelerator. Methodologically: (i) a dynamic sampling strategy senses per-pixel rendering difficulty online and adaptively reduces redundant ray samples; (ii) the volume rendering of color and density is decoupled and approximated to lower MLP computational overhead; and (iii) a ReRAM-based Compute-in-Memory (CIM) architecture is developed, with optimized data mapping and a data reuse microarchitecture to improve energy efficiency. Experimental results show that ASDR achieves up to 9.55× and 69.75× speedup over state-of-the-art NeRF accelerators and the Jetson Xavier NX GPU, respectively, with only a 0.1 dB PSNR loss.
📝 Abstract
Neural Radiance Fields (NeRF) offer significant promise for generating photorealistic images and videos. However, existing mainstream neural rendering models often fall short in meeting the demands for immediacy and power efficiency in practical applications. Specifically, these models frequently exhibit irregular access patterns and substantial computational overhead, leading to undesirable inference latency and high power consumption. Computing-in-memory (CIM), an emerging computational paradigm, has the potential to address these access bottlenecks and reduce the power consumption associated with model execution.
To bridge the gap between model performance and real-world scene requirements, we propose ASDR, an algorithm-architecture co-designed, CIM-based accelerator that supports efficient neural rendering. At the algorithmic level, we propose two rendering optimization schemes: (1) dynamic sampling, which senses the rendering difficulty of different pixels online and thereby reduces memory access and computational overhead; and (2) decoupling and approximating the volume rendering of color and density, which reduces MLP overhead. At the architecture level, we design an efficient ReRAM-based CIM architecture with efficient data mapping and a data reuse microarchitecture. Experiments demonstrate that our design achieves up to $9.55\times$ and $69.75\times$ speedup over state-of-the-art NeRF accelerators and the Xavier NX GPU, respectively, in graphics rendering tasks with only $0.1$ dB PSNR loss.
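The idea behind difficulty-aware dynamic sampling can be illustrated with a minimal Python sketch. It is not the ASDR implementation: the linear rule in `samples_for_pixel`, the bounds `n_min` and `n_max`, the `difficulty` score in [0, 1], and the `field` callback are all hypothetical names chosen for illustration. Only `composite` follows the standard NeRF volume rendering quadrature.

```python
import math

def samples_for_pixel(difficulty, n_min=8, n_max=64):
    """Map a difficulty score in [0, 1] to a per-ray sample count.

    Hypothetical linear rule; the paper's actual policy is not specified here.
    """
    d = min(max(difficulty, 0.0), 1.0)  # clamp to [0, 1]
    return n_min + round(d * (n_max - n_min))

def composite(sigmas, colors, delta):
    """Standard NeRF quadrature: C = sum_i T_i * (1 - exp(-sigma_i * delta)) * c_i."""
    transmittance = 1.0
    out = [0.0, 0.0, 0.0]
    for sigma, c in zip(sigmas, colors):
        alpha = 1.0 - math.exp(-sigma * delta)
        weight = transmittance * alpha
        out = [o + weight * ci for o, ci in zip(out, c)]
        transmittance *= 1.0 - alpha  # light remaining after this sample
    return out

def render_pixel(field, difficulty, near=0.0, far=1.0):
    """Render one ray with a sample budget chosen from its difficulty.

    `field(t)` is an assumed callback returning (sigma, (r, g, b)) at depth t.
    """
    n = samples_for_pixel(difficulty)
    delta = (far - near) / n
    depths = [near + (i + 0.5) * delta for i in range(n)]  # midpoint samples
    sigmas, colors = zip(*(field(t) for t in depths))
    return composite(sigmas, colors, delta), n

# A homogeneous red medium: easy pixels need far fewer field queries.
def uniform(t):
    return 2.0, (1.0, 0.0, 0.0)

easy_color, easy_n = render_pixel(uniform, difficulty=0.0)
hard_color, hard_n = render_pixel(uniform, difficulty=1.0)
```

In this toy scene the composited color is the same for both budgets, which is the case where cutting samples is free; in scenes with fine detail, the lower budget trades accuracy for fewer field queries and memory accesses.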