SegMASt3R: Geometry Grounded Segment Matching

📅 2025-10-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenging problem of segment matching across wide-baseline images—particularly under extreme viewpoint variations (up to 180°), occlusion, and illumination changes. To this end, it is the first to introduce geometric inductive biases from 3D foundation models into segment matching. Methodologically, it integrates 3D spatial reasoning with SAM2’s segmentation priors to establish cross-image region correspondences that are both semantically coherent and geometrically consistent. The core contribution lies in explicitly modeling scene geometry via 3D representations, which substantially enhances matching robustness under large viewpoint shifts. On the ScanNet++ and Replica benchmarks, the method achieves up to a 30% improvement in AUPRC over prior state-of-the-art approaches. Furthermore, it consistently improves performance on the downstream tasks of 3D instance segmentation and image-goal navigation.

📝 Abstract
Segment matching is an important intermediate task in computer vision that establishes correspondences between semantically or geometrically coherent regions across images. Unlike keypoint matching, which focuses on localized features, segment matching captures structured regions, offering greater robustness to occlusions, lighting variations, and viewpoint changes. In this paper, we leverage the spatial understanding of 3D foundation models to tackle wide-baseline segment matching, a challenging setting involving extreme viewpoint shifts. We propose an architecture that uses the inductive bias of these 3D foundation models to match segments across image pairs with up to 180-degree viewpoint change. Extensive experiments show that our approach outperforms state-of-the-art methods, including the SAM2 video propagator and local feature matching methods, by up to 30% on the AUPRC metric on the ScanNet++ and Replica datasets. We further demonstrate the benefits of the proposed model on relevant downstream tasks, including 3D instance segmentation and image-goal navigation. Project Page: https://segmast3r.github.io/
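The headline number above is reported in AUPRC (area under the precision-recall curve), which scores a ranked list of candidate segment-pair correspondences against ground-truth matches. As a minimal sketch of what that metric computes—not the paper's exact evaluation protocol, and with hypothetical example scores—AUPRC can be obtained by sweeping a threshold over match confidences and integrating precision over recall:

```python
def auprc(scores, labels):
    """Area under the precision-recall curve for binary match labels.

    scores: per-pair confidence that two segments correspond (hypothetical).
    labels: 1 if the pair is a ground-truth match, else 0.
    Integrates precision step-wise over recall, in descending score order.
    """
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    total_pos = sum(labels)
    tp = fp = 0
    area, prev_recall = 0.0, 0.0
    for _, is_match in pairs:
        if is_match:
            tp += 1
        else:
            fp += 1
        precision = tp / (tp + fp)
        recall = tp / total_pos
        area += precision * (recall - prev_recall)
        prev_recall = recall
    return area

# A ranker that puts all true matches first achieves AUPRC = 1.0.
print(auprc([0.9, 0.8, 0.3], [1, 1, 0]))  # → 1.0
```

In practice, library implementations such as scikit-learn's `average_precision_score` compute the same quantity; the sketch is only meant to make the metric behind the 30% claim concrete.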
Problem

Research questions and friction points this paper is trying to address.

Matching image segments under extreme viewpoint changes
Leveraging 3D foundation models for wide-baseline matching
Improving robustness over existing segmentation and matching methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses 3D foundation models for segment matching
Matches segments across 180-degree viewpoint changes
Outperforms SAM2 and local feature methods
Rohit Jayanti
Graduate Researcher, IIIT-Hyderabad
Visual SLAM · Structure-from-Motion · 3D Scene Understanding
Swayam Agrawal
IIIT Hyderabad
Vansh Garg
IIIT Hyderabad
Siddharth Tourani
University of Heidelberg
Muhammad Haris Khan
MBZUAI
Sourav Garg
(former) Research Fellow, Uni. Adelaide
Robotics · Computer Vision · Deep Learning
Madhava Krishna
IIIT Hyderabad