Finding NeMO: A Geometry-Aware Representation of Template Views for Few-Shot Perception

Date: 2026-02-04
AI Summary
This work addresses the challenge of detecting, segmenting, and estimating the 6DoF pose of previously unseen objects given only a few RGB template views. To this end, the authors propose NeMO (Neural Memory Object), a novel object representation that encodes sparse template views into a learnable sparse point cloud embedding both semantic and geometric priors. A multi-task decoder then leverages this unified NeMO representation to perform various dense prediction tasks without requiring camera parameters or retraining on target data. The key innovation lies in externalizing object priors into a shared NeMO format, enabling single-network multi-task inference and rapid deployment for new objects. Experiments demonstrate that the method achieves state-of-the-art or competitive performance across multiple datasets in the BOP benchmark, significantly enhancing system scalability and efficiency.
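
For readers new to the term, a 6DoF pose is a rigid transform with three rotational and three translational degrees of freedom. The sketch below is a generic illustration of that definition and is not taken from the NeMO code: it builds a 4x4 pose from a rotation vector (via Rodrigues' formula) and a translation, then applies it to a tiny object-frame point cloud. The function names and the example values are assumptions chosen for illustration.

```python
import numpy as np


def rotation_from_rotvec(rotvec: np.ndarray) -> np.ndarray:
    """Rodrigues' formula: axis-angle vector (3 DoF) -> 3x3 rotation matrix."""
    theta = np.linalg.norm(rotvec)
    if theta < 1e-12:
        return np.eye(3)
    k = rotvec / theta
    # Skew-symmetric cross-product matrix of the unit axis k.
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)


def pose_matrix(rotvec: np.ndarray, translation: np.ndarray) -> np.ndarray:
    """Assemble a 4x4 rigid transform from rotation (3 DoF) and translation (3 DoF)."""
    T = np.eye(4)
    T[:3, :3] = rotation_from_rotvec(rotvec)
    T[:3, 3] = translation
    return T


def transform_points(T: np.ndarray, points: np.ndarray) -> np.ndarray:
    """Map (N, 3) object-frame points into the camera frame."""
    return points @ T[:3, :3].T + T[:3, 3]


# Hypothetical example: rotate 90 degrees about z, then move 2 units along z.
object_points = np.array([[1.0, 0.0, 0.0],
                          [0.0, 1.0, 0.0],
                          [0.0, 0.0, 1.0]])
T_cam_obj = pose_matrix(np.array([0.0, 0.0, np.pi / 2]), np.array([0.0, 0.0, 2.0]))
camera_points = transform_points(T_cam_obj, object_points)
```

Estimating the pose of an object in a query image amounts to recovering a transform like `T_cam_obj` from image evidence; how NeMO's decoder does this is not detailed in the summary above.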

Abstract
We present Neural Memory Object (NeMO), a novel object-centric representation that can be used to detect, segment and estimate the 6DoF pose of objects unseen during training using RGB images. Our method consists of an encoder that requires only a few RGB template views depicting an object to generate a sparse object-like point cloud using a learned UDF containing semantic and geometric information. Next, a decoder takes the object encoding together with a query image to generate a variety of dense predictions. Through extensive experiments, we show that our method can be used for few-shot object perception without requiring any camera-specific parameters or retraining on target data. Our proposed concept of outsourcing object information in a NeMO and using a single network for multiple perception tasks enhances interaction with novel objects, improving scalability and efficiency by enabling quick object onboarding without retraining or extensive pre-processing. We report competitive and state-of-the-art results on various datasets and perception tasks of the BOP benchmark, demonstrating the versatility of our approach.
Repository: https://github.com/DLR-RM/nemo
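
The abstract mentions a learned UDF, which in this context most plausibly means an unsigned distance field: a function that returns the distance from any 3D location to the nearest surface. The sketch below only illustrates that general idea and makes no claim about how NeMO builds its point cloud. As an assumption, an analytic sphere stands in for the learned field, and the function names, band width, and sample count are hypothetical.

```python
import numpy as np

RADIUS = 0.5  # hypothetical stand-in object: a sphere centered at the origin


def udf_sphere(points: np.ndarray) -> np.ndarray:
    """Unsigned distance from each (N, 3) point to the sphere surface."""
    return np.abs(np.linalg.norm(points, axis=1) - RADIUS)


def udf_gradient(udf, points: np.ndarray, eps: float = 1e-4) -> np.ndarray:
    """Central finite-difference gradient of a UDF at each point."""
    grad = np.zeros_like(points)
    for axis in range(3):
        offset = np.zeros(3)
        offset[axis] = eps
        grad[:, axis] = (udf(points + offset) - udf(points - offset)) / (2.0 * eps)
    return grad


def sparse_surface_points(udf, num_candidates: int = 2000,
                          band: float = 0.05, seed: int = 0) -> np.ndarray:
    """Sample random candidates, keep those near the surface, project them onto it."""
    rng = np.random.default_rng(seed)
    candidates = rng.uniform(-1.0, 1.0, size=(num_candidates, 3))
    dist = udf(candidates)
    near = candidates[dist < band]
    # Step against the gradient by the current distance to land on the zero level set.
    return near - udf(near)[:, None] * udf_gradient(udf, near)


points = sparse_surface_points(udf_sphere)
print(points.shape, float(udf_sphere(points).max()))
```

The result is a small set of 3D points lying on the surface, which is the kind of sparse, object-like point cloud the abstract describes. In NeMO each point would additionally carry semantic and geometric information; that part is not modeled here.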

Problem

Research questions and friction points this paper is trying to address.

few-shot perception
6DoF pose estimation
object detection
object segmentation
novel object understanding

Innovation

Methods, ideas, or system contributions that make the work stand out.

Neural Memory Object
few-shot perception
geometry-aware representation
6DoF pose estimation
object-centric representation

Authors

Sebastian Jung, German Aerospace Center (DLR)
Leonard Klupfel, German Aerospace Center (DLR)
Rudolph Triebel, German Aerospace Center (DLR) and Karlsruhe Institute of Technology (KIT). Topics: Robotics, Machine Learning, Perception, Computer Vision
M. Durner, German Aerospace Center (DLR)