When Slots Compete: Slot Merging in Object-Centric Learning

📅 2026-03-11
🤖 AI Summary
This work addresses a limitation of slot-based object-centric learning: because the number of slots is fixed upfront, multiple slots often compete for overlapping regions of the same entity instead of separating distinct objects. The authors propose a lightweight, drop-in slot-merging operation that requires no additional learnable modules. A Soft-IoU score between slot-attention maps quantifies overlap, a fixed policy with a threshold inferred from overlap statistics selects which pairs to merge during training, and a barycentric update combines each selected pair while preserving gradient flow. Integrated into the DINOSAUR feature-reconstruction pipeline, the method improves object factorization and mask quality and surpasses other adaptive methods on object discovery and segmentation benchmarks.

📝 Abstract
Slot-based object-centric learning represents an image as a set of latent slots with a decoder that combines them into an image or features. The decoder specifies how slots are combined into an output, but the slot set is typically fixed: the number of slots is chosen upfront and slots are only refined. This can lead to multiple slots competing for overlapping regions of the same entity rather than focusing on distinct regions. We introduce slot merging: a drop-in, lightweight operation on the slot set that merges overlapping slots during training. We quantify overlap with a Soft-IoU score between slot-attention maps and combine selected pairs via a barycentric update that preserves gradient flow. Merging follows a fixed policy, with the decision threshold inferred from overlap statistics, requiring no additional learnable modules. Integrated into the established feature-reconstruction pipeline of DINOSAUR, the proposed method improves object factorization and mask quality, surpassing other adaptive methods in object discovery and segmentation benchmarks.
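The merging step described in the abstract can be illustrated with a small, dependency-free sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: Soft-IoU is taken as the ratio of summed element-wise minima to summed element-wise maxima, the threshold is a hypothetical mean-plus-`k`-standard-deviations rule over the pairwise scores, and the barycentric weights are proportional to each slot's total attention mass. The function names (`soft_iou`, `overlap_threshold`, `barycentric_merge`, `merge_once`) are hypothetical and not taken from the paper or from DINOSAUR.

```python
from itertools import combinations
from statistics import mean, stdev


def soft_iou(a, b, eps=1e-8):
    """Soft IoU of two attention maps (flat lists of values in [0, 1]).

    Assumed form: sum of element-wise minima over sum of element-wise maxima.
    """
    inter = sum(min(x, y) for x, y in zip(a, b))
    union = sum(max(x, y) for x, y in zip(a, b))
    return inter / (union + eps)


def overlap_threshold(scores, k=1.0):
    """Hypothetical threshold from overlap statistics: mean + k * std."""
    if len(scores) < 2:
        return float("inf")  # too few pairs to estimate statistics
    return mean(scores) + k * stdev(scores)


def barycentric_merge(slot_a, slot_b, attn_a, attn_b):
    """Convex combination of two slots, weighted by attention mass (assumed)."""
    mass_a, mass_b = sum(attn_a), sum(attn_b)
    w = mass_a / (mass_a + mass_b)
    merged_slot = [w * x + (1.0 - w) * y for x, y in zip(slot_a, slot_b)]
    # The merged slot takes over the regions attended to by both inputs.
    merged_attn = [x + y for x, y in zip(attn_a, attn_b)]
    return merged_slot, merged_attn


def merge_once(slots, attn, k=1.0):
    """Merge the most-overlapping pair if its score exceeds the threshold."""
    pairs = list(combinations(range(len(slots)), 2))
    scores = [soft_iou(attn[i], attn[j]) for i, j in pairs]
    if not scores:
        return slots, attn
    best = max(range(len(pairs)), key=scores.__getitem__)
    if scores[best] <= overlap_threshold(scores, k):
        return slots, attn  # no pair overlaps enough; leave the set unchanged
    i, j = pairs[best]
    merged_slot, merged_attn = barycentric_merge(slots[i], slots[j], attn[i], attn[j])
    keep = [n for n in range(len(slots)) if n not in (i, j)]
    new_slots = [slots[n] for n in keep] + [merged_slot]
    new_attn = [attn[n] for n in keep] + [merged_attn]
    return new_slots, new_attn


# Toy example: three slots over four pixels; slots 0 and 1 split the same region.
slots = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
attn = [
    [0.5, 0.5, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
]
new_slots, new_attn = merge_once(slots, attn)
print(len(new_slots))  # 2: the competing pair was merged into one slot
```

Because the merged slot is a convex combination of its two inputs, the same update written in an autodiff framework would pass gradients back to both original slots, which is one plausible reading of "preserves gradient flow". A real implementation would operate on batched tensors and may select and merge several pairs per step; this sketch merges at most one pair for clarity.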

Problem

Research questions and friction points this paper is trying to address.

slot merging
object-centric learning
slot competition
overlapping regions
object factorization

Innovation

Methods, ideas, or system contributions that make the work stand out.

slot merging
object-centric learning
Soft-IoU
barycentric update
DINOSAUR
Christos Chatzisavvas
Department of Electrical and Computer Engineering, Democritus University of Thrace, Greece; Institute for Language and Speech Processing, Athena Research Center, Greece
Panagiotis Rigas
Department of Informatics & Telecommunications, National and Kapodistrian University of Athens, Greece; Archimedes, Athena Research Center, Greece
George Ioannakis
Institute for Language and Speech Processing, Athena Research Center, Greece
Vassilis Katsouros
Athena Research Center - Institute for Language and Speech Processing
Nikolaos Mitianoudis
Professor in Audio and Image Processing, Democritus University of Thrace, Greece
Audio Signal Processing, Image and Video Processing, Deep Learning, Statistical Signal Processing