🤖 AI Summary
Medical image labels inherently exhibit hierarchical structure (e.g., organ → tissue → subtype), yet prevailing self-supervised learning (SSL) methods ignore this hierarchy, yielding semantically inconsistent representations. To address this, we propose a hierarchy-aware contrastive learning framework compatible with both Euclidean and hyperbolic embeddings, without modifying the network architecture. Our method treats the label taxonomy as an explicit supervisory signal and evaluation target. Key contributions are: (1) a Hierarchy-Weighted Contrastive (HWC) loss that modulates attraction/repulsion strength between positive/negative pairs according to the number of ancestors they share in the label tree; and (2) a Level-Aware Margin (LAM) mechanism, a prototype-based margin that separates ancestor groups at each level of the hierarchy. We evaluate with hierarchy-sensitive metrics (e.g., HF1, H-Acc) across multiple medical imaging benchmarks, obtaining consistent improvements over strong SSL baselines. Ablation studies confirm that each component is effective on its own and that combining them yields the most taxonomy-aligned and semantically interpretable representations.
📝 Abstract
Medical image labels are often organized by taxonomies (e.g., organ → tissue → subtype), yet standard self-supervised learning (SSL) ignores this structure. We present a hierarchy-preserving contrastive framework that makes the label tree a first-class training signal and an evaluation target. Our approach introduces two plug-in objectives: Hierarchy-Weighted Contrastive (HWC), which scales positive/negative pair strengths by shared ancestors to promote within-parent coherence, and Level-Aware Margin (LAM), a prototype margin that separates ancestor groups across levels. The formulation is geometry-agnostic and applies to Euclidean and hyperbolic embeddings without architectural changes. Across several benchmarks, including breast histopathology, the proposed objectives consistently improve representation quality over strong SSL baselines while better respecting the taxonomy. We evaluate with metrics tailored to hierarchy faithfulness: HF1 (hierarchical F1), H-Acc (tree-distance-weighted accuracy), and parent-distance violation rate. We also report top-1 accuracy for completeness. Ablations show that HWC and LAM are effective even without curvature, and combining them yields the most taxonomy-aligned representations. Taken together, these results provide a simple, general recipe for learning medical image representations that respect the label tree and advance both performance and interpretability in hierarchy-rich domains.
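The evaluation metrics named above can be sketched under standard hierarchical-classification definitions; the paper's exact formulas are not given here, so these are assumed forms. HF1 is taken as the F1 overlap between the ancestor sets of the predicted and true labels, and H-Acc as accuracy discounted by normalized tree distance between the predicted and true leaves.

```python
def ancestors(path):
    """All prefixes of a label path: (a, b, c) -> {(a,), (a, b), (a, b, c)}."""
    return {path[:k] for k in range(1, len(path) + 1)}

def hf1(pred, true):
    """Hierarchical F1 for one prediction: overlap of ancestor sets.

    Precision/recall are computed over the predicted and true ancestor
    sets, a common hierarchical-evaluation convention (assumed here).
    """
    p, t = ancestors(pred), ancestors(true)
    inter = len(p & t)
    if inter == 0:
        return 0.0
    prec, rec = inter / len(p), inter / len(t)
    return 2 * prec * rec / (prec + rec)

def h_acc(pred, true):
    """Tree-distance-weighted accuracy for one leaf prediction.

    With common-prefix length c and depth d, the leaf-to-leaf tree
    distance is 2*(d - c); normalizing by the maximum 2*d gives a
    credit of c/d, so near-misses under the same parent score higher
    than errors in a different branch.
    """
    depth = len(true)
    c = 0
    for a, b in zip(pred, true):
        if a != b:
            break
        c += 1
    return c / depth
```

Under these definitions, predicting the wrong subtype within the correct tissue (e.g., depth-3 labels sharing two levels) scores 2/3 on both metrics, whereas a plain top-1 accuracy would score it 0.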