ORGAN: Object-Centric Representation Learning using Cycle Consistent Generative Adversarial Networks

📅 2026-03-02
🤖 AI Summary
This work addresses the unsupervised extraction of object-centric, structured representations from complex real-world scenes with many objects and low visual contrast. It proposes ORGAN, an approach based on cycle-consistent Generative Adversarial Networks (GANs) rather than the autoencoder architectures that dominate the field. The method segments an input image into its objects and represents each object in a low-dimensional latent space. Experiments show that ORGAN performs similarly to other state-of-the-art approaches on synthetic datasets and is the only approach tested that handles real-world images with many objects and low contrast. The learned representations allow object-level manipulation, and the method scales well with both the number of objects and the image size.

📝 Abstract
Although data generation is often straightforward, extracting information from data is more difficult. Object-centric representation learning can extract information from images in an unsupervised manner. It does so by segmenting an image into its subcomponents: the objects. Each object is then represented in a low-dimensional latent space that can be used for downstream processing. Object-centric representation learning is dominated by autoencoder architectures (AEs). Here, we present ORGAN, a novel approach for object-centric representation learning, which is based on cycle-consistent Generative Adversarial Networks instead. We show that it performs similarly to other state-of-the-art approaches on synthetic datasets, while at the same time being the only approach tested here capable of handling more challenging real-world datasets with many objects and low visual contrast. Complementing these results, ORGAN creates expressive latent space representations that allow for object manipulation. Finally, we show that ORGAN scales well both with respect to the number of objects and the size of the images, giving it a unique edge over current state-of-the-art approaches.
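The abstract does not spell out the training objective, so the following is only a minimal sketch of the generic cycle-consistency idea used by cycle-consistent GANs, not ORGAN's actual architecture. It assumes a hypothetical encoder that maps an image to `K` per-object latent vectors and a hypothetical generator that maps those latents back to an image; both are toy linear maps here, and the adversarial (discriminator) terms are omitted entirely. All names and sizes (`encode`, `generate`, `K`, `D`, `H`, `W`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: K object slots, D latent dimensions per object, H x W image.
K, D, H, W = 3, 4, 8, 8

# Toy linear stand-ins for the real networks (assumption, not from the paper).
W_enc = rng.normal(scale=0.1, size=(K * D, H * W))
W_gen = rng.normal(scale=0.1, size=(H * W, K * D))


def encode(image):
    """Map an image of shape (H, W) to K per-object latents of shape (K, D)."""
    return (W_enc @ image.ravel()).reshape(K, D)


def generate(latents):
    """Map K per-object latents of shape (K, D) back to an image of shape (H, W)."""
    return (W_gen @ latents.ravel()).reshape(H, W)


def cycle_losses(image, latents):
    """Mean absolute (L1) cycle-consistency error in both directions."""
    # image -> latents -> image should return the original image.
    image_cycle = np.abs(generate(encode(image)) - image).mean()
    # latents -> image -> latents should return the original latents.
    latent_cycle = np.abs(encode(generate(latents)) - latents).mean()
    return image_cycle, latent_cycle


image = rng.random((H, W))
latents = rng.normal(size=(K, D))
image_cycle, latent_cycle = cycle_losses(image, latents)
print(f"image cycle loss: {image_cycle:.4f}, latent cycle loss: {latent_cycle:.4f}")
```

In a real cycle-consistent GAN these two terms would be added to adversarial losses and minimized by gradient descent; the sketch only evaluates them once for untrained maps.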
Problem

Research questions and friction points this paper is trying to address.

object-centric representation learning
unsupervised learning
real-world datasets
low visual contrast
multi-object scenes
Innovation

Methods, ideas, or system contributions that make the work stand out.

object-centric representation learning
cycle-consistent GANs
unsupervised segmentation
scalable latent representation
real-world image processing
Joël Küchler
Laboratory of Biosensors and Bioelectronics, Institute for Biomedical Engineering, University and ETH Zurich, Gloriastrasse 37/39, Zurich, 8092, Switzerland
Ellen van Maren
Department of Neurology, Insel Gruppe, Bern, Switzerland
Vaiva Vasiliauskaitė
Laboratory of Biosensors and Bioelectronics, Institute for Biomedical Engineering, University and ETH Zurich, Gloriastrasse 37/39, Zurich, 8092, Switzerland
Katarina Vulić
Laboratory of Biosensors and Bioelectronics, Institute for Biomedical Engineering, University and ETH Zurich, Gloriastrasse 37/39, Zurich, 8092, Switzerland
Reza Abbasi-Asl
Associate Professor of Neurology and Bioengineering | UCSF
Machine Learning, Computational Neuroscience, Applied Statistics
Stephan J. Ihle
Department of Neurobiology, University of Chicago, 951 E 58th St, Chicago, 60637, IL, USA; Department of Physics, University of Chicago, 929 E 57th St, Chicago, 60637, IL, USA