TopoGeoScore: A Self-Supervised Source-Only Geometric Framework for OOD Checkpoint Selection

📅 2026-05-09

🤖 AI Summary

This work addresses the problem of assessing out-of-distribution (OOD) robustness when target-domain labels are unavailable, and in particular of selecting robust model checkpoints from source-domain information alone. The authors propose TopoGeoScore, a self-supervised geometric scoring framework that builds class-conditional mutual k-nearest-neighbour graphs from source embeddings and extracts three signals from them: a torsion-inspired reduced Laplacian log-determinant for global class-manifold complexity, Ollivier–Ricci curvature for local neighbourhood regularity, and higher-order topological summaries for fragmented connectivity, loops, and global–local inconsistency. A self-supervised objective learns non-negative weights for these signals, producing an interpretable robustness score that requires no target-domain samples or labels. Experiments on CIFAR-based corruption and distribution-shift benchmarks, ImageNet-C, MNLI→HANS, and OGBN-Arxiv suggest that source-domain representations carry geometric and topological signals predictive of OOD robustness, supporting checkpoint selection before deployment.

📝 Abstract

Out-of-distribution (OOD) robustness is difficult to diagnose when target-domain labels are unavailable. We consider a more restrictive source-only variant of unsupervised accuracy estimation: selecting robust checkpoints using only source-domain representations, with no target samples or target labels. We propose TopoGeoScore, a source-only geometric scorer for label-free OOD checkpoint selection. Given a trained checkpoint, we construct class-conditional mutual k-nearest-neighbour graphs from source embeddings and extract three interpretable signals: a torsion-inspired reduced Laplacian log-determinant for global class-manifold complexity, Ollivier–Ricci curvature for local neighbourhood regularity, and higher-order topological summaries for fragmented connectivity, loops, and global–local inconsistency. Instead of fixing their weights by hand, TopoGeoScore learns a non-negative linear score through a self-supervised objective that enforces invariance under approximately geometry-preserving embedding views and separation from structure-breaking views. The score remains interpretable and uses no target-domain samples or labels. Results across CIFAR-based corruption and distribution-shift benchmarks, ImageNet-C, MNLI→HANS transfer, and OGBN-Arxiv suggest that source representations contain measurable global–local–topological evidence of robustness, supporting practical checkpoint selection before deployment under distribution shift.
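
To make the first of the three signals concrete, the following is a minimal sketch of a class-conditional mutual k-nearest-neighbour graph and a reduced Laplacian log-determinant computed from it. It is not the authors' implementation: the abstract does not specify the distance metric, the value of k, which row and column are removed to form the reduced Laplacian, or how disconnected graphs are handled. Euclidean distance, removal of the first row and column, the small ridge term `eps`, and all function names below are assumptions made for illustration.

```python
import numpy as np


def mutual_knn_adjacency(X, k):
    """Symmetric 0/1 adjacency: i ~ j iff each is among the other's k nearest neighbours.

    Assumes Euclidean distance and k < number of rows in X (illustrative choices).
    """
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # pairwise squared distances
    np.fill_diagonal(d2, np.inf)                      # a point is not its own neighbour
    nn = np.argsort(d2, axis=1)[:, :k]                # indices of the k nearest neighbours
    directed = np.zeros((n, n), dtype=bool)
    directed[np.repeat(np.arange(n), k), nn.ravel()] = True
    return (directed & directed.T).astype(float)      # keep only mutual edges


def reduced_laplacian_logdet(A, eps=1e-6):
    """log det of the graph Laplacian with its first row and column removed.

    The ridge term eps (an assumption) keeps the value finite when the
    mutual k-NN graph is disconnected and the reduced Laplacian is singular.
    """
    L = np.diag(A.sum(axis=1)) - A
    L_red = L[1:, 1:] + eps * np.eye(A.shape[0] - 1)
    _, logdet = np.linalg.slogdet(L_red)
    return float(logdet)


def class_conditional_logdets(embeddings, labels, k=5):
    """One log-determinant per class, each from that class's own mutual k-NN graph."""
    return {
        int(c): reduced_laplacian_logdet(mutual_knn_adjacency(embeddings[labels == c], k))
        for c in np.unique(labels)
    }
```

With `eps=0` and a connected graph, the determinant of the reduced Laplacian equals the number of spanning trees (the Matrix-Tree theorem), which is one way to read this quantity as a measure of global graph complexity. How TopoGeoScore aggregates the per-class values and combines them with the curvature and topological signals is not described in the abstract and is left out here.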

Problem

Research questions and friction points this paper is trying to address.

out-of-distribution robustness

checkpoint selection

source-only

unsupervised accuracy estimation

distribution shift

Innovation

Methods, ideas, or system contributions that make the work stand out.

self-supervised

source-only

geometric scoring