SynDaCaTE: A Synthetic Dataset For Evaluating Part-Whole Hierarchical Inference

📅 2025-06-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the difficulty of verifying whether capsule networks actually learn the part-whole hierarchies they are designed to infer. To make this question quantitatively testable, the authors introduce SynDaCaTE, a controllable synthetic benchmark built specifically for evaluating part-whole hierarchical inference. Using it as a diagnostic, they pinpoint the precise bottleneck in a prominent existing capsule model and show empirically that permutation-equivariant self-attention is highly effective for parts-to-wholes inference. On SynDaCaTE, the attention-based approach achieves near-perfect (≈100%) accuracy at identifying part-whole relationships, substantially outperforming both standard CNNs and the original capsule model, which motivates future directions for designing effective inductive biases for computer vision.

📝 Abstract
Learning to infer object representations, and in particular part-whole hierarchies, has been the focus of extensive research in computer vision, in pursuit of improving data efficiency, systematic generalisation, and robustness. Models which are *designed* to infer part-whole hierarchies, often referred to as capsule networks, are typically trained end-to-end on supervised tasks such as object classification, in which case it is difficult to evaluate whether such a model *actually* learns to infer part-whole hierarchies, as claimed. To address this difficulty, we present a SYNthetic DAtaset for CApsule Testing and Evaluation, abbreviated as SynDaCaTE, and establish its utility by (1) demonstrating the precise bottleneck in a prominent existing capsule model, and (2) demonstrating that permutation-equivariant self-attention is highly effective for parts-to-wholes inference, which motivates future directions for designing effective inductive biases for computer vision.
Problem

Research questions and friction points this paper is trying to address.

Evaluating part-whole hierarchy learning in models
Identifying limitations in existing capsule networks
Exploring self-attention for parts-to-wholes inference
Innovation

Methods, ideas, or system contributions that make the work stand out.

Synthetic dataset for capsule model evaluation
Identifies bottleneck in existing capsule networks
Proposes permutation-equivariant self-attention for inference
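
The paper does not spell out its architecture here, but the key property it relies on can be sketched: plain dot-product self-attention with no positional encoding treats its inputs as an unordered set, so permuting the input part vectors permutes the outputs identically. A minimal numpy illustration (all names and dimensions below are hypothetical, not from the paper):

```python
import numpy as np

def self_attention(parts, Wq, Wk, Wv):
    """Plain dot-product self-attention over a set of part vectors.

    parts: (n_parts, d) array; Wq, Wk, Wv: (d, d) weight matrices.
    With no positional encoding, permuting the rows of `parts`
    permutes the output rows the same way (permutation equivariance).
    """
    Q, K, V = parts @ Wq, parts @ Wk, parts @ Wv
    scores = Q @ K.T / np.sqrt(parts.shape[1])
    # Row-wise softmax over attention scores.
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
d = 8
parts = rng.normal(size=(5, d))           # 5 hypothetical part vectors
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

out = self_attention(parts, Wq, Wk, Wv)

# Equivariance check: permuting the parts permutes the outputs identically.
perm = rng.permutation(5)
out_perm = self_attention(parts[perm], Wq, Wk, Wv)
assert np.allclose(out[perm], out_perm)
```

This set-based symmetry is a natural fit for parts-to-wholes inference, where the collection of detected parts has no privileged ordering.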