🤖 AI Summary
This work addresses model-free, zero-shot 2D detection of unseen objects in unconstrained scenes. Methodologically, it introduces the first framework integrating Gaussian splatting with vision foundation models (VFMs): geometry is modeled via Gaussian reconstruction from a reference video, while semantic features are distilled from VFMs such as SAM and CLIP, yielding a joint geometric-semantic representation that removes any reliance on CAD templates or predefined 3D models. The framework thus enables fast, category-agnostic object localization from a single reference video alone. Evaluated on the BOP-H3 benchmark, it matches the performance of CAD-based methods, and in the model-free 2D detection track of the BOP Challenge 2024 it received both the best overall method and the best fast method awards. These results validate the model-free paradigm as a practical basis for real-world 6D pose estimation pipelines.
📝 Abstract
We present GFreeDet, an approach for unseen object detection that leverages Gaussian splatting and vision foundation models under the model-free setting. Unlike existing methods that rely on predefined CAD templates, GFreeDet reconstructs objects directly from reference videos via Gaussian splatting, enabling robust detection of novel objects without prior 3D models. On the BOP-H3 benchmark, GFreeDet achieves performance comparable to CAD-based methods, demonstrating the viability of model-free detection for mixed reality (MR) applications. Notably, GFreeDet won the best overall method and the best fast method awards in the model-free 2D detection track of the BOP Challenge 2024.
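The abstract implies a matching stage: views rendered from the Gaussian reconstruction serve as templates, and detection proposals in a test image are scored against them in a VFM embedding space. The sketch below is a minimal, hypothetical illustration of such category-agnostic matching via cosine similarity; the function name `match_proposals` and the toy feature vectors are assumptions, not part of GFreeDet, and in practice the embeddings would come from models like CLIP applied to SAM-segmented proposals and rendered template views.

```python
import numpy as np

def match_proposals(template_feats, proposal_feats):
    """Score each detection proposal by its best cosine similarity
    to any rendered-template embedding (category-agnostic matching).
    Note: hypothetical sketch, not the GFreeDet implementation."""
    # L2-normalize so dot products equal cosine similarities.
    t = template_feats / np.linalg.norm(template_feats, axis=1, keepdims=True)
    p = proposal_feats / np.linalg.norm(proposal_feats, axis=1, keepdims=True)
    sim = p @ t.T               # (num_proposals, num_templates)
    return sim.max(axis=1)      # best-matching template score per proposal

# Toy example: 3 proposals, 2 templates in a 4-D feature space.
templates = np.array([[1.0, 0.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0, 0.0]])
proposals = np.array([[0.9, 0.1, 0.0, 0.0],   # close to template 0
                      [0.0, 0.0, 1.0, 0.0],   # matches no template
                      [0.1, 0.9, 0.0, 0.0]])  # close to template 1
scores = match_proposals(templates, proposals)
```

Proposals whose embeddings resemble any template view receive high scores, so the detector needs no class labels, only the reference video of the target object.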