🤖 AI Summary
To address motion blur, defocus, illumination inconsistency, and unstable reconstruction quality inherent in manual multi-view image capture, this paper proposes a lightweight 3D reconstruction pipeline built around turntable-based acquisition. Methodologically, the authors design a consumer-grade automated turntable setup enabling standardized, blur-free acquisition of multi-view, multi-illumination images. Because the light source rotates relative to the object as the turntable spins, standard fixed-lighting methods such as 3D Gaussian Splatting (3DGS) degrade; the paper therefore introduces a radiance representation conditioned on light rotations that implicitly disentangles illumination and enables relighting. The approach integrates automated turntable control, capture from several camera positions, and light-rotation-conditioned radiance modeling within a 3DGS framework. Experiments demonstrate that hundreds of high-quality images can be captured within minutes; the resulting reconstructions are competitive with densely sampled manual captures while additionally supporting arbitrary light-source relighting and illumination harmonization. This significantly enhances the robustness and practicality of novel-view synthesis.
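The acquisition loop described above can be sketched as a simple schedule: for each tripod position, the turntable sweeps 360°, and because the physical light stays fixed while the object rotates, each shot carries an implied light rotation in the object's reference frame. The function and field names below are illustrative, not the paper's API; the defaults (120 views per sweep, 3 camera heights) are assumptions.

```python
def capture_schedule(n_views=120, n_heights=3):
    """Sketch of the turntable acquisition schedule (names are illustrative).

    For each tripod height, the turntable sweeps a full 360 degrees. In the
    *object* frame, the fixed light source therefore appears rotated by the
    negative turntable angle for each shot -- this is the per-image light
    rotation that later conditions the radiance representation.
    """
    shots = []
    for h in range(n_heights):
        for i in range(n_views):
            table_angle = 360.0 * i / n_views        # turntable rotation (world frame)
            light_angle = (-table_angle) % 360.0     # light rotation (object frame)
            shots.append({"height": h, "view": table_angle, "light": light_angle})
    return shots

schedule = capture_schedule()  # 3 heights x 120 views = 360 shots
```

Pausing the turntable at each step before triggering the camera is what eliminates the motion blur of handheld capture.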
📝 Abstract
Novel view synthesis (NVS) from multiple captured photos of an object is a widely studied problem. Achieving high quality typically requires dense sampling of input views, which can lead to frustrating and tedious manual labor. Manually positioning cameras to maintain a desired view distribution is difficult, and even when a good distribution is found, it is not easy to replicate. Additionally, the captured data can suffer from motion blur and defocus due to human error. In this paper, we present a lightweight object capture pipeline that reduces the manual workload and standardizes the acquisition setup. We use a consumer turntable to carry the object and a tripod to hold the camera. As the turntable rotates, we automatically capture dense samples from various views and lighting conditions; we can repeat this for several camera positions. This way, we can easily capture hundreds of valid images in several minutes without hands-on effort. However, in the object reference frame, the lighting varies across views; this is harmful to a standard NVS method like 3D Gaussian splatting (3DGS), which assumes fixed lighting. We design a neural radiance representation conditioned on light rotations, which addresses this issue and allows relightability as an additional benefit. We demonstrate our pipeline using 3DGS as the underlying framework, achieving competitive quality compared to previous methods with exhaustive acquisition and showcasing its potential for relighting and harmonization tasks.