🤖 AI Summary
This work addresses the lack of theoretical coverage guarantees in the semantic feature space of existing contrastive learning methods, which struggle to ensure both inclusion of positive samples and exclusion of negative ones. The authors bring conformal prediction into contrastive learning and propose minimum-volume covering sets with learnable generalized multi-norm constraints. Because volume minimization is shown to act as a proxy for negative-sample exclusion, the method works even when negative pairs are unavailable, while the positive-sample guarantee inherits the distribution-free coverage property of conformal prediction. Experiments on simulated and real-world image datasets show improved trade-offs between positive-sample inclusion and negative-sample exclusion compared with standard distance-based conformal baselines.
📝 Abstract
Contrastive learning produces coherent semantic feature embeddings by encouraging positive samples to cluster closely while separating negative samples. However, existing contrastive learning methods lack principled guarantees on coverage within the semantic feature space. We extend conformal prediction to this setting by introducing minimum-volume covering sets equipped with learnable generalized multi-norm constraints. We propose a method that constructs conformal sets guaranteeing user-specified coverage of positive samples while maximizing negative sample exclusion. We establish theoretically that volume minimization serves as a proxy for negative exclusion, enabling our approach to operate effectively even when negative pairs are unavailable. The positive inclusion guarantee inherits the distribution-free coverage property of conformal prediction, while negative exclusion is maximized through learned set geometry optimized on a held-out training split. Experiments on simulated and real-world image datasets demonstrate improved inclusion-exclusion trade-offs compared to standard distance-based conformal baselines.
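For orientation, the kind of standard distance-based conformal baseline mentioned in the abstract can be sketched as a split-conformal ball around each anchor embedding. This is an illustrative sketch only, not the authors' method: the synthetic data, the Euclidean score, and the names `conformal_radius` and `sample_pairs` are assumptions made for this example. The proposed approach replaces the fixed Euclidean ball with a learned multi-norm set, which is not reproduced here.

```python
import math
import numpy as np

def conformal_radius(scores, alpha):
    """Split-conformal threshold: the k-th smallest calibration score,
    with k = ceil((n + 1) * (1 - alpha)). Hypothetical helper name."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return float("inf")  # too few calibration points for this alpha
    return float(np.sort(scores)[k - 1])

rng = np.random.default_rng(0)
dim, n_cal, n_test, alpha = 8, 500, 2000, 0.1

def sample_pairs(n):
    # Synthetic stand-in for (anchor, positive) embedding pairs (assumption).
    anchors = rng.normal(size=(n, dim))
    positives = anchors + 0.3 * rng.normal(size=(n, dim))
    return anchors, positives

# Calibration: nonconformity score = Euclidean distance from anchor to positive.
cal_a, cal_p = sample_pairs(n_cal)
cal_scores = np.linalg.norm(cal_p - cal_a, axis=1)
radius = conformal_radius(cal_scores, alpha)

# The conformal set for an anchor is the ball of this radius around it.
test_a, test_p = sample_pairs(n_test)
test_scores = np.linalg.norm(test_p - test_a, axis=1)
coverage = float(np.mean(test_scores <= radius))
print(f"radius={radius:.3f}, empirical positive coverage={coverage:.3f}")
```

Under exchangeability of calibration and test pairs, the ball covers a new positive with probability at least 1 - alpha on average over calibration draws. Nothing in this baseline controls how many negatives fall inside the ball, which is the gap that a minimum-volume set is meant to address.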