🤖 AI Summary
This work addresses the challenge in deep clustering where representation learning and clustering tasks are typically decoupled due to the discrete nature of clustering optimization, hindering end-to-end training. To bridge this gap, the authors propose a novel energy-based loss function that, for the first time, integrates an associative memory mechanism into a deep clustering framework. By leveraging an energy model, the method tightly couples continuous representation learning with discrete clustering, enabling joint end-to-end optimization. The approach is highly flexible, compatible with diverse network architectures—including convolutional, residual, and fully connected networks—and applicable across modalities such as images and text. Extensive experiments on multiple benchmarks demonstrate significant improvements in clustering performance, confirming the method’s effectiveness and generalizability.
📝 Abstract
Deep clustering - joint representation learning and latent-space clustering - is a well-studied problem, especially in computer vision and text processing under the deep learning framework. While representation learning is generally differentiable, clustering is an inherently discrete optimization task, requiring various approximations and regularizations to fit into a standard differentiable pipeline. This leads to a somewhat disjointed treatment of representation learning and clustering. In this work, we propose a novel loss function utilizing energy-based dynamics via Associative Memories to formulate a new deep clustering method, DCAM, which ties the representation learning and clustering aspects together more intricately in a single objective. Our experiments showcase the advantage of DCAM, producing improved clustering quality across architecture choices (convolutional, residual, or fully-connected) and data modalities (images or text).
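The abstract does not spell out the loss, but the general idea of an energy-based associative memory coupling continuous embeddings with discrete cluster prototypes can be sketched with a modern-Hopfield-style energy: stored "memories" act as cluster centers, and an embedding's energy is low when it lies near one of them. The function names, the `beta` temperature, and the specific log-sum-exp form below are illustrative assumptions, not the authors' actual formulation.

```python
import numpy as np

def associative_energy(z, memories, beta=4.0):
    """Modern-Hopfield-style energy of embedding z w.r.t. stored memories.

    E(z) = -(1/beta) * logsumexp(beta * <m_k, z>) + 0.5 * ||z||^2
    Low energy means z sits close to some stored prototype m_k.
    NOTE: an illustrative stand-in, not the DCAM objective from the paper.
    """
    scores = beta * memories @ z              # similarity to each prototype
    m = scores.max()                          # stabilized log-sum-exp
    lse = m + np.log(np.exp(scores - m).sum())
    return -lse / beta + 0.5 * z @ z

def assign_cluster(z, memories):
    """Discrete read-out: index of the most similar prototype."""
    return int(np.argmax(memories @ z))

# Toy check: three orthogonal prototypes as cluster centers.
memories = np.eye(3)
z_near = np.array([1.0, 0.0, 0.0])   # aligned with prototype 0
z_far = np.array([0.5, 0.5, 0.5])    # equidistant from all prototypes
print(associative_energy(z_near, memories) < associative_energy(z_far, memories))
print(assign_cluster(z_near, memories))
```

Because the energy is differentiable in both the embedding and the memories, minimizing its average over a dataset can, in principle, update an encoder and the cluster prototypes jointly, which is the kind of end-to-end coupling the abstract describes.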