🤖 AI Summary
This work addresses the challenging problem of novel view synthesis (NVS) for long, unstructured videos captured by non-professional users, which exhibit irregular camera motion, unknown camera poses, and large-scale, complex scenes. We propose a calibration-free, incremental joint optimization framework that: (1) simultaneously optimizes camera poses and the 3D Gaussian Splatting representation; (2) incorporates a learned 3D geometric prior for robust pose initialization; and (3) introduces a spatial-density-aware, octree-based anchor construction mechanism for efficient organization and rendering of massive point clouds. Evaluated on multiple challenging long-video benchmarks, the method achieves state-of-the-art rendering quality, pose accuracy, and computational efficiency. To the best of our knowledge, it is the first approach to enable high-fidelity, long-duration NVS without any auxiliary information (e.g., IMU data, depth sensors, or pre-calibrated cameras).
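The spatial-density-aware octree anchor idea can be illustrated with a minimal sketch: voxels containing more points than a density threshold are recursively subdivided, and an anchor is placed at the center of each leaf voxel, so dense regions receive more anchors than sparse ones. All names here (`octree_anchors`, `density_thresh`, `base_size`) and the specific splitting rule are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def octree_anchors(points, base_size=1.0, levels=3, density_thresh=32):
    """Hypothetical sketch of density-aware octree anchor formation:
    subdivide a voxel while it holds more than density_thresh points
    (up to `levels` octree levels), then emit the leaf voxel center."""
    anchors = []

    def refine(pts, origin, size, level):
        if len(pts) == 0:
            return
        # Stop at max depth or once the voxel is sparse enough.
        if level == levels - 1 or len(pts) <= density_thresh:
            anchors.append(origin + size / 2.0)  # anchor at voxel center
            return
        half = size / 2.0
        # Split the voxel into 8 children and recurse.
        for ix in range(2):
            for iy in range(2):
                for iz in range(2):
                    o = origin + half * np.array([ix, iy, iz])
                    mask = np.all((pts >= o) & (pts < o + half), axis=1)
                    refine(pts[mask], o, half, level + 1)

    # Root-level voxel grid over the point cloud's bounding box.
    mins = points.min(axis=0)
    grid = np.floor((points - mins) / base_size).astype(int)
    for cell in np.unique(grid, axis=0):
        origin = mins + cell * base_size
        mask = np.all(grid == cell, axis=1)
        refine(points[mask], origin, base_size, 0)
    return np.array(anchors)
```

In this toy version the anchor count adapts to local density: a cluster of points triggers subdivision and yields several fine-grained anchors, while an isolated region collapses into a single coarse anchor, which is the intuition behind converting a massive dense point cloud into a compact anchor set.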
📝 Abstract
LongSplat addresses critical challenges in novel view synthesis (NVS) from casually captured long videos characterized by irregular camera motion, unknown camera poses, and expansive scenes. Current methods often suffer from pose drift, inaccurate geometry initialization, and severe memory limitations. To address these issues, we introduce LongSplat, a robust unposed 3D Gaussian Splatting framework featuring: (1) Incremental Joint Optimization that concurrently optimizes camera poses and 3D Gaussians to avoid local minima and ensure global consistency; (2) a robust Pose Estimation Module leveraging learned 3D priors; and (3) an efficient Octree Anchor Formation mechanism that converts dense point clouds into anchors based on spatial density. Extensive experiments on challenging benchmarks demonstrate that LongSplat achieves state-of-the-art results, substantially improving rendering quality, pose accuracy, and computational efficiency compared to prior approaches. Project page: https://linjohnss.github.io/longsplat/