🤖 AI Summary
This work addresses the challenge of acquiring high-quality training data for multi-view stereo, which is typically costly and labor-intensive to collect. The authors propose SimpleProc, a minimally rule-based procedural method that synthesizes multi-view image pairs through an automated pipeline combining NURBS surfaces, procedural geometric modeling, displacement mapping, and texture synthesis. Remarkably, models trained on only 8,000 SimpleProc-generated images outperform those trained on an equal number of real-world images. When scaled to 352,000 synthetic images, the approach matches or exceeds models trained on 692,000 carefully curated real images, demonstrating the efficiency of rule-driven synthetic data generation for multi-view stereo reconstruction.
📝 Abstract
In this paper, we explore the design space of procedural rules for multi-view stereo (MVS). We demonstrate that we can generate effective training data using SimpleProc: a new, fully procedural generator driven by a very small set of rules using Non-Uniform Rational B-Splines (NURBS) together with basic displacement and texture patterns. At a modest scale of 8,000 images, our approach achieves superior results compared to manually curated images (at the same scale) sourced from games and real-world objects. When scaled to 352,000 images, our method performs comparably to, and on several benchmarks exceeds, models trained on over 692,000 manually curated images. The source code and the data are available at https://github.com/princeton-vl/SimpleProc.
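To make the pipeline concrete, here is a minimal, hypothetical sketch of the kind of rule the abstract describes: sampling a random rational B-spline (NURBS) patch from random control points and weights, then applying a simple displacement pattern. The function names, parameter choices, and the sinusoidal displacement are illustrative assumptions, not the authors' actual generator (which renders full multi-view image pairs with textures).

```python
import numpy as np

rng = np.random.default_rng(0)

def bspline_basis(i, order, t, knots):
    """i-th B-spline basis function of the given order (Cox-de Boor recursion)."""
    if order == 1:
        return np.where((knots[i] <= t) & (t < knots[i + 1]), 1.0, 0.0)
    out = np.zeros_like(t)
    left = knots[i + order - 1] - knots[i]
    if left > 0:
        out += (t - knots[i]) / left * bspline_basis(i, order - 1, t, knots)
    right = knots[i + order] - knots[i + 1]
    if right > 0:
        out += (knots[i + order] - t) / right * bspline_basis(i + 1, order - 1, t, knots)
    return out

def random_nurbs_patch(n_ctrl=6, order=4, res=32):
    """Sample one random NURBS surface patch and displace it (illustrative only)."""
    # Clamped uniform knot vector: `order` repeated knots at each end.
    knots = np.concatenate([np.zeros(order - 1),
                            np.linspace(0.0, 1.0, n_ctrl - order + 2),
                            np.ones(order - 1)])
    t = np.linspace(0.0, 1.0, res, endpoint=False)  # half-open parameter domain
    B = np.stack([bspline_basis(i, order, t, knots) for i in range(n_ctrl)])
    ctrl = rng.normal(size=(n_ctrl, n_ctrl, 3))        # random control-point grid
    w = rng.uniform(0.5, 2.0, size=(n_ctrl, n_ctrl))   # rational weights
    # Tensor-product rational evaluation: S = (sum B_i B_j w_ij P_ij) / (sum B_i B_j w_ij)
    num = np.einsum('iu,jv,ijc->uvc', B, B, ctrl * w[..., None])
    den = np.einsum('iu,jv,ij->uv', B, B, w)
    surf = num / den[..., None]
    # Displacement-mapping stand-in: a fixed-frequency bump pattern along z.
    uu, vv = np.meshgrid(t, t, indexing='ij')
    surf[..., 2] += 0.1 * np.sin(8 * np.pi * uu) * np.sin(8 * np.pi * vv)
    return surf

patch = random_nurbs_patch()
print(patch.shape)  # (32, 32, 3)
```

A full generator would additionally texture such patches and render them from multiple calibrated viewpoints to produce stereo training pairs with ground-truth depth.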