City-Mesh3R: Simulation-Ready City-Scale 3D Mesh Reconstruction from Multi-View Images

📅 2026-05-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing city-scale 3D reconstruction methods struggle to produce complete, watertight, and geometrically regular meshes suitable for simulation. To address this limitation, this work proposes a scalable end-to-end image-to-mesh reconstruction framework that adopts a divide-and-conquer strategy: it avoids global feature matching through topological image clustering and spatial partitioning, enabling distributed reconstruction via block-wise sparse structure-from-motion (SfM) and geometry-aware camera selection. Furthermore, the method introduces a curvature-adaptive remeshing technique that dynamically adjusts vertex density to preserve fine details while enhancing geometric regularity. Notably, the approach operates without global optimization, efficiently generating high-fidelity, watertight, detail-rich, and topologically consistent 3D meshes across city-scale scenes.
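The summary does not state how vertex density is tied to curvature. As a purely illustrative sketch, one common sizing rule picks the longest edge whose chord stays within a tolerance of a circle with the local curvature radius. The function name `target_edge_length` and the parameters `tol`, `l_min`, and `l_max` are assumptions made for this example, not values or names from the paper.

```python
import math

def target_edge_length(curvature, tol=0.01, l_min=0.02, l_max=2.0):
    """Hypothetical sizing rule: shorter edges where the surface bends more.

    For a circle of radius r = 1/|curvature|, a chord of length L deviates
    from the arc by at most tol when L = 2 * sqrt(tol * (2r - tol)).
    """
    k = abs(curvature)
    if k == 0.0:
        # Flat region: use the coarsest allowed edge length.
        return l_max
    radius = 1.0 / k
    if tol >= radius:
        # Feature is smaller than the tolerance: fall back to the finest size.
        return l_min
    length = 2.0 * math.sqrt(tol * (2.0 * radius - tol))
    # Clamp to the allowed range so density stays bounded.
    return min(l_max, max(l_min, length))

# Flat facade vs. sharp ornament: the curved region gets shorter edges.
flat = target_edge_length(0.0)
curved = target_edge_length(10.0)
```

A remesher could use such a per-vertex target length to decide where to split or collapse edges, which is one way to keep detail on curved regions while leaving planar regions coarse and regular.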
📝 Abstract
City-scale 3D surface reconstruction from multi-view images for downstream 3D simulation poses highly challenging problems due to the scale and complexity of urban scenes. Existing city-scale 3D reconstruction methods based on NeRF, Gaussian Splatting, etc. often fail to recover simulation-ready 3D meshes because of incomplete or missing geometry and irregular, noisy surfaces. Scaling existing small-scale 3D reconstruction methods to arbitrarily large urban scenes is infeasible due to their computational complexity. We present City-Mesh3R, a scalable framework for reconstructing watertight surface meshes directly from large unordered image collections. Unlike recent methods that use a global sparse SfM point-cloud initialization followed by distributed dense 3D reconstruction of large-scale scenes, our method follows an end-to-end images-to-mesh reconstruction approach using a divide-and-conquer strategy. The sparse city map is reconstructed via topological image clustering, independent cluster-wise sparse SfM, and map merging, without the need for exhaustive image feature matching. This map is then partitioned spatially to perform geometry-aware camera selection, followed by dense surface reconstruction and surface refinement using curvature-aware remeshing with adaptive vertex density. The partition meshes are then stitched together to produce the global mesh of the city. The proposed end-to-end framework is evaluated on city-scale reconstruction datasets. As our qualitative and quantitative results demonstrate, the method yields high-fidelity, watertight 3D meshes with regular geometry that capture fine surface details, and it scales to arbitrarily large scenes because the entire pipeline runs in a distributed setting.
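To make the partition-then-select step concrete, here is a minimal sketch that assumes the sparse map is a set of 3D points, each with a list of the cameras that observe it. The grid partitioning, the `min_points` threshold, and the names `partition_points` and `select_cameras` are hypothetical; the abstract does not specify the actual partitioning scheme or selection criterion.

```python
from collections import defaultdict

def partition_points(points, cell_size):
    """Group point ids by the XY grid cell they fall into (assumed scheme)."""
    cells = defaultdict(list)
    for pid, (x, y, _z) in points.items():
        key = (int(x // cell_size), int(y // cell_size))
        cells[key].append(pid)
    return dict(cells)

def select_cameras(cell_point_ids, observations, min_points=2):
    """Keep cameras that observe at least min_points of the cell's points."""
    counts = defaultdict(int)
    for pid in cell_point_ids:
        for cam in observations.get(pid, ()):
            counts[cam] += 1
    return sorted(cam for cam, n in counts.items() if n >= min_points)

# Toy sparse map: point id -> (x, y, z), and point id -> observing cameras.
points = {0: (1.0, 1.0, 0.0), 1: (2.0, 3.0, 0.0),
          2: (12.0, 1.0, 0.0), 3: (13.0, 2.0, 1.0)}
observations = {0: ["camA", "camB"], 1: ["camA"],
                2: ["camB", "camC"], 3: ["camC", "camB"]}

cells = partition_points(points, cell_size=10.0)
cams_per_cell = {key: select_cameras(pids, observations)
                 for key, pids in cells.items()}
```

Because each cell ends up with its own point subset and camera list, the dense reconstruction for one cell does not need data from the others, which is what makes a distributed run possible in this kind of design.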
Problem

Research questions and friction points this paper is trying to address.

city-scale 3D reconstruction
simulation-ready mesh
multi-view images
watertight surface
urban scene modeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

watertight mesh
city-scale reconstruction
divide-and-conquer
curvature-aware remeshing
simulation-ready 3D
Sayan Paul
Visual Computing and Embodied AI Lab, TCS Research
Robotics, Computer Vision, Artificial Intelligence, Machine Learning
Sourav Ghosh
Visual Computing & Embodied AI Lab, TCS Research, India
Siddharth Katageri
MS by Research at IIIT Hyderabad
Geometric deep learning, 3D Computer Vision, Representation Learning
Soumyadip Maity
Visual Computing & Embodied AI Lab, TCS Research, India
Sanjana Sinha
TCS Research
Computer Vision, Image Processing, Machine Learning
Brojeshwar Bhowmick
Visual Computing & Embodied AI Lab, TCS Research, India