🤖 AI Summary
Existing category-agnostic 3D instance segmentation methods generalize poorly because real-world annotations are scarce and 2D priors are noisy; meanwhile, mainstream 3D synthetic data fail to simultaneously ensure geometric diversity, contextual complexity, and layout plausibility. This paper introduces the first data synthesis framework explicitly designed for category-agnostic 3D instance segmentation. Leveraging a heterogeneous CAD asset library, it integrates large language model (LLM)-driven spatial layout reasoning, depth-first-search-based layout optimization, and multi-view RGB-D rendering with point cloud fusion to generate high-fidelity, diverse, and semantically plausible synthetic scenes. On ScanNetV2, ScanNet++, and S3DIS, the synthesized data significantly boosts the zero-shot generalization of state-of-the-art models, including Mask3D and OpenScene, outperforming prior synthetic-data approaches. The results empirically validate the critical role of structured, layout-aware synthetic data in advancing category-agnostic 3D instance segmentation.
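The summary mentions depth-first-search-based layout optimization. The sketch below illustrates, under assumptions, how such a backtracking placement search could be structured; the function names, the candidate-pose generator, and the collision test are hypothetical illustrations and not the paper's actual implementation.

```python
def dfs_place(objects, placed, candidates, collides):
    """Depth-first search for a collision-free scene layout (illustrative sketch).

    objects    -- remaining objects to place (e.g. sampled CAD assets)
    placed     -- list of (obj, pose) pairs already fixed in the scene
    candidates -- fn(obj, placed) -> candidate poses, e.g. derived from
                  LLM-suggested spatial relations ("chair next to desk")
    collides   -- fn(obj, pose, placed) -> True if the placement overlaps
    """
    if not objects:                          # every object placed: success
        return placed
    obj, rest = objects[0], objects[1:]
    for pose in candidates(obj, placed):     # try candidate poses in order
        if collides(obj, pose, placed):
            continue                         # skip an overlapping placement
        layout = dfs_place(rest, placed + [(obj, pose)], candidates, collides)
        if layout is not None:               # a full consistent layout was found
            return layout
    return None                              # backtrack: no valid pose for obj
```

In a full pipeline, a call such as `dfs_place(sampled_objects, [], candidates, collides)` would either return a complete placement or `None`, in which case the objects or constraints could be resampled.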
📝 Abstract
Class-agnostic 3D instance segmentation tackles the challenging task of segmenting all object instances, including previously unseen ones, without relying on semantic classes. Current methods struggle to generalize due to scarce annotated 3D scene data or noisy 2D segmentations. While synthetic data generation offers a promising solution, existing 3D scene synthesis methods fail to simultaneously satisfy geometric diversity, contextual complexity, and layout plausibility, each essential for this task. To address these needs, we propose an Adapted 3D Scene Synthesis pipeline for class-agnostic 3D Instance SegmenTation, termed ASSIST-3D, which synthesizes data tailored to enhancing model generalization. Specifically, ASSIST-3D features three key innovations: 1) Heterogeneous Object Selection from extensive 3D CAD asset collections, incorporating randomness in object sampling to maximize geometric and contextual diversity; 2) Scene Layout Generation through LLM-guided spatial reasoning combined with depth-first search for plausible object placements; and 3) Realistic Point Cloud Construction via multi-view RGB-D image rendering and fusion from the synthetic scenes, closely mimicking real-world sensor data acquisition. Experiments on the ScanNetV2, ScanNet++, and S3DIS benchmarks demonstrate that models trained with ASSIST-3D-generated data significantly outperform existing methods. Further comparisons underscore the superiority of our purpose-built pipeline over existing 3D scene synthesis approaches.
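The third step fuses multi-view RGB-D renderings into a point cloud. Below is a minimal sketch of depth back-projection and voxel-based fusion, assuming a standard pinhole camera model with known intrinsics and camera-to-world poses; the function names and the voxel size are hypothetical and not taken from the paper.

```python
import numpy as np

def backproject(depth, K, cam_to_world):
    """Back-project a depth map (H, W) into world-space 3D points."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.reshape(-1)
    valid = z > 0                                    # drop invalid/missing depth
    x = (u.reshape(-1) - K[0, 2]) * z / K[0, 0]      # pinhole model, camera frame
    y = (v.reshape(-1) - K[1, 2]) * z / K[1, 1]
    pts_cam = np.stack([x, y, z, np.ones_like(z)], axis=0)[:, valid]
    return (cam_to_world @ pts_cam)[:3].T            # (N, 3) world-space points

def fuse_views(depths, Ks, poses, voxel=0.02):
    """Fuse per-view point clouds and voxel-downsample to mimic a real scan."""
    pts = np.concatenate([backproject(d, K, T)
                          for d, K, T in zip(depths, Ks, poses)], axis=0)
    keys = np.floor(pts / voxel).astype(np.int64)    # integer voxel-grid keys
    _, idx = np.unique(keys, axis=0, return_index=True)
    return pts[idx]                                  # keep one point per voxel
```

Rendering the synthetic scene from several viewpoints, back-projecting each depth map, and fusing the result in this way yields a point cloud whose density and occlusion patterns resemble those produced by real RGB-D sensor scans.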