🤖 AI Summary
Deep learning models struggle with object binding in multi-object visual scenes: they fail to reliably separate the features of different objects and integrate the features that belong to the same object. To address this, we propose a neuroscience-inspired, synchrony-driven object binding mechanism. For the first time, we embed Kuramoto phase synchronization dynamics into deep neural networks, using complex-valued weights and learnable phase encodings so that features of the same object align in phase, which yields differentiable, dynamic, object-level feature clustering. Our method supports both feedforward and recurrent (feedback) architectures and introduces a phase-driven feature grouping module. On multi-object classification tasks, including overlapping handwritten digits, noisy images, and out-of-distribution geometric transformations, our model outperforms both real-valued baselines and ablated complex-valued models without Kuramoto synchronization. The results indicate that phase synchronization improves robustness and generalization, supporting its functional role in perceptual grouping.
📝 Abstract
Neural synchrony is hypothesized to play a crucial role in how the brain organizes visual scenes into structured representations, enabling the robust encoding of multiple objects within a scene. However, current deep learning models often struggle with object binding, limiting their ability to represent multiple objects effectively. Inspired by neuroscience, we investigate whether synchrony-based mechanisms can enhance object encoding in artificial models trained for visual categorization. Specifically, we combine complex-valued representations with Kuramoto dynamics to promote phase alignment, facilitating the grouping of features belonging to the same object. We evaluate two architectures employing synchrony: a feedforward model and a recurrent model with feedback connections to refine phase synchronization using top-down information. Both models outperform their real-valued counterparts and complex-valued models without Kuramoto synchronization on tasks involving multi-object images, such as overlapping handwritten digits, noisy inputs, and out-of-distribution transformations. Our findings highlight the potential of synchrony-driven mechanisms to enhance deep learning models, improving their performance, robustness, and generalization in complex visual categorization tasks.
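To make the phase-alignment idea concrete, the following is a minimal sketch of Kuramoto dynamics on a handful of oscillators, not the paper's implementation. It assumes identical natural frequencies (set to zero), a fixed coupling matrix built from known object labels (attractive within an object, repulsive between objects), and plain Euler integration; the paper instead works with complex-valued network representations. The function names `kuramoto_step`, `coherence`, and `simulate`, and all parameter values, are hypothetical choices for illustration.

```python
import cmath
import math
import random


def kuramoto_step(phases, coupling, dt=0.1):
    """One Euler step of d(theta_i)/dt = (1/N) * sum_j K_ij * sin(theta_j - theta_i)."""
    n = len(phases)
    updated = []
    for i in range(n):
        # Net pull on oscillator i from all others, weighted by the coupling matrix.
        pull = sum(coupling[i][j] * math.sin(phases[j] - phases[i]) for j in range(n))
        updated.append((phases[i] + dt * pull / n) % (2 * math.pi))
    return updated


def coherence(phases):
    """Kuramoto order parameter |mean(exp(i*theta))|; values near 1 mean synchrony."""
    return abs(sum(cmath.exp(1j * p) for p in phases) / len(phases))


def simulate(labels, steps=500, k_within=4.0, k_between=-1.0, seed=0):
    """Evolve random initial phases under label-based coupling (an assumption here)."""
    rng = random.Random(seed)
    phases = [rng.uniform(0, 2 * math.pi) for _ in labels]
    # Positive coupling for features of the same object, negative otherwise.
    coupling = [[k_within if a == b else k_between for b in labels] for a in labels]
    for _ in range(steps):
        phases = kuramoto_step(phases, coupling)
    return phases


if __name__ == "__main__":
    labels = [0, 0, 0, 0, 1, 1, 1, 1]  # two hypothetical objects, four features each
    final = simulate(labels)
    print("object 0 coherence:", round(coherence(final[:4]), 3))
    print("object 1 coherence:", round(coherence(final[4:]), 3))
    print("global coherence:  ", round(coherence(final), 3))
```

Under these assumptions, oscillators that share a label should converge to a common phase while the two groups drift apart in phase, so per-object coherence is high and global coherence is low. This is the sense in which phase can act as a grouping signal for features of the same object.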