LLM-Powered Code Analysis and Optimization for Gaussian Splatting Kernels

📅 2025-09-29
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
Manual GPU kernel optimization for 3D Gaussian Splatting (3DGS) is time-consuming, expertise-intensive, and suffers from an exponentially large search space. Method: This work pioneers the integration of large language models (LLMs)โ€”including DeepSeek and GPT-5โ€”into real-world, domain-specific CUDA kernel optimization. We propose a human-in-the-loop framework that iteratively analyzes and refactors CUDA kernels for 3DGS and the Seele rendering framework, guided by LLM reasoning and profiler-derived runtime feedback. Contribution/Results: On the MipNeRF360 dataset, our approach achieves an average speedup of 38% (up to 42%), approaching hand-tuned performance; it further improves the already highly optimized Seele framework by 6%. This study delineates the practical capabilities and limitations of LLMs in low-level code-level performance tuning, validates their viability in high-performance computing contexts, and uncovers previously unexplored optimization opportunities beyond expert knowledge.
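The human-in-the-loop cycle described above (profile → prompt the LLM with code plus runtime feedback → benchmark the rewrite → keep it only if it is faster) can be sketched as follows. This is a minimal illustrative sketch, not code from the paper: `profile`, `llm_rewrite`, and `benchmark` are hypothetical stand-ins for an Nsight-style profiler run, a DeepSeek/GPT-5 API call, and an nvcc compile-and-time step.

```python
# Hypothetical sketch of the iterative LLM-driven kernel optimization loop.
# All function bodies are toy stand-ins, not the paper's implementation.

def profile(kernel_src: str) -> dict:
    # Stand-in for a profiler run (e.g., Nsight Compute); returns toy metrics.
    return {"runtime_ms": len(kernel_src) * 0.01, "occupancy": 0.5}

def llm_rewrite(kernel_src: str, metrics: dict) -> str:
    # Stand-in for prompting an LLM with the kernel source and profiler
    # feedback; here we merely simulate a rewrite that shrinks the source.
    return kernel_src.replace("  ", " ")

def benchmark(kernel_src: str) -> float:
    # Stand-in for compiling with nvcc and timing the kernel on real inputs.
    return len(kernel_src) * 0.01

def optimize(kernel_src: str, iterations: int = 3) -> tuple[str, float]:
    """Iteratively ask the LLM for rewrites, keeping only verified speedups."""
    best_src, best_time = kernel_src, benchmark(kernel_src)
    for _ in range(iterations):
        candidate = llm_rewrite(best_src, profile(best_src))
        t = benchmark(candidate)
        if t < best_time:  # accept the rewrite only if it measurably helps
            best_src, best_time = candidate, t
    return best_src, best_time
```

The key design point, as the summary emphasizes, is that the LLM never decides alone: every candidate rewrite is validated against profiler-measured runtime before being accepted.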

๐Ÿ“ Abstract
3D Gaussian splatting (3DGS) is a transformative technique with profound implications on novel view synthesis and real-time rendering. Given its importance, there have been many attempts to improve its performance. However, with the increasing complexity of GPU architectures and the vast search space of performance-tuning parameters, it is a challenging task. Although manual optimizations have achieved remarkable speedups, they require domain expertise and the optimization process can be highly time consuming and error prone. In this paper, we propose to exploit large language models (LLMs) to analyze and optimize Gaussian splatting kernels. To our knowledge, this is the first work to use LLMs to optimize highly specialized real-world GPU kernels. We reveal the intricacies of using LLMs for code optimization and analyze the code optimization techniques from the LLMs. We also propose ways to collaborate with LLMs to further leverage their capabilities. For the original 3DGS code on the MipNeRF360 datasets, LLMs achieve significant speedups, 19% with Deepseek and 24% with GPT-5, demonstrating the different capabilities of different LLMs. By feeding additional information from performance profilers, the performance improvement from LLM-optimized code is enhanced to up to 42% and 38% on average. In comparison, our best-effort manually optimized version can achieve a performance improvement up to 48% and 39% on average, showing that there are still optimizations beyond the capabilities of current LLMs. On the other hand, even upon a newly proposed 3DGS framework with algorithmic optimizations, Seele, LLMs can still further enhance its performance by 6%, showing that there are optimization opportunities missed by domain experts. This highlights the potential of collaboration between domain experts and LLMs.
Problem

Research questions and friction points this paper is trying to address.

Optimizing Gaussian splatting kernels using large language models
Automating GPU code analysis to reduce manual optimization efforts
Enhancing 3DGS performance through LLM-driven code transformations
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLMs optimize Gaussian splatting GPU kernels
Performance profilers enhance LLM optimization effectiveness
LLMs collaborate with experts to uncover missed optimizations