LatentDiff: Scaling Semantic Dataset Comparison to Millions of Images

📅 2026-04-28
📈 Citations: 0
Influential citations: 0

🤖 AI Summary
This work addresses the challenge of efficiently detecting very subtle semantic shifts, affecting fewer than 1% of images, in large-scale image datasets. The authors propose LatentDiff, a framework that operates in the latent space of pretrained vision encoders and combines sparse autoencoders with density ratio estimation to detect semantic discrepancies in a sensitive and interpretable way. LatentDiff is presented as the first method to incorporate sparse autoencoders into latent-space divergence analysis. The work also introduces Noisy-Diff, a new benchmark designed to better reflect real-world conditions. Experimental results indicate that LatentDiff outperforms existing image-caption-based methods at ultra-low shift ratios while substantially reducing computational overhead.
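To make the sparse-autoencoder side of the idea concrete, here is a minimal sketch of how per-feature divergence testing on sparse codes might look. It is an illustration under stated assumptions, not the authors' procedure: `sae_encode` stands in for an already trained sparse autoencoder, `feature_shift_zscores` is a hypothetical helper that applies a plain two-proportion z-test to how often each feature fires, and the codes are synthetic.

```python
import numpy as np

def sae_encode(X, W_enc, b_enc):
    """ReLU encoder of a hypothetical, already trained sparse autoencoder."""
    return np.maximum(X @ W_enc + b_enc, 0.0)

def feature_shift_zscores(codes_a, codes_b, eps=1e-12):
    """Two-proportion z-score per feature: does it fire more often in B than in A?"""
    n_a, n_b = len(codes_a), len(codes_b)
    fire_a = (codes_a > 0).mean(axis=0)  # firing rate of each feature in A
    fire_b = (codes_b > 0).mean(axis=0)  # firing rate of each feature in B
    pooled = (fire_a * n_a + fire_b * n_b) / (n_a + n_b)
    se = np.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b)) + eps
    return (fire_b - fire_a) / se

# Synthetic sparse codes (assumption): 16 features, each firing in about 5% of images.
rng = np.random.default_rng(0)
n, k = 5000, 16
codes_a = rng.random((n, k)) * (rng.random((n, k)) < 0.05)
codes_b = rng.random((n, k)) * (rng.random((n, k)) < 0.05)

# Feature 3 never fires in A and fires in exactly 1% of B (a sparse shift).
codes_a[:, 3] = 0.0
codes_b[:, 3] = 0.0
codes_b[:50, 3] = 1.0

z = feature_shift_zscores(codes_a, codes_b)
top = int(np.argmax(z))
print("most over-represented feature in B:", top)
```

Because each sparse feature is scored separately, a shift that touches only 1% of images can still stand out on the one feature that encodes it, which is one plausible reading of why sparse codes help with interpretability here.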
📝 Abstract
We present LatentDiff, a scalable framework for semantic dataset comparison that operates directly in the latent space of pretrained vision encoders. By combining sparse autoencoder-based divergence testing with density ratio estimation, LatentDiff identifies interpretable semantic differences between datasets at a fraction of the computational cost of caption-based alternatives. We also introduce Noisy-Diff, a benchmark capturing realistic sparse distribution shifts that cause existing methods to struggle. Experiments demonstrate that LatentDiff achieves superior accuracy while remaining robust in settings where an extremely small fraction of images (from 5% down to less than 1%) differ semantically.
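The density ratio estimation step mentioned in the abstract can be sketched with a generic, classifier-based estimator. The abstract does not say which estimator LatentDiff uses, so this is an assumption: the code fits a small logistic regression to tell dataset A from dataset B and converts its logit into an estimate of p_B(x) / p_A(x). The function names, hyperparameters, and Gaussian "embeddings" are all hypothetical.

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, steps=500, l2=1e-3):
    """Gradient-descent logistic regression; returns (weights, bias)."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * (X.T @ (p - y) / n + l2 * w)
        b -= lr * np.mean(p - y)
    return w, b

def density_ratio(X_a, X_b, X_query):
    """Estimate p_B(x) / p_A(x) at X_query via probabilistic classification."""
    X = np.vstack([X_a, X_b])
    y = np.concatenate([np.zeros(len(X_a)), np.ones(len(X_b))])
    w, b = fit_logistic(X, y)
    logit = X_query @ w + b
    # Bayes' rule: p_B / p_A = (P(B|x) / P(A|x)) * (n_A / n_B)
    return np.exp(logit) * (len(X_a) / len(X_b))

# Synthetic stand-ins for encoder embeddings (assumption): 8-dim Gaussians.
rng = np.random.default_rng(0)
X_a = rng.normal(size=(4000, 8))
X_b = rng.normal(size=(4000, 8))

# 5% of B is shifted along one latent direction.
shifted = np.zeros(4000, dtype=bool)
shifted[:200] = True
X_b[shifted, 0] += 6.0

ratios = density_ratio(X_a, X_b, X_b)
print("mean ratio, shifted images:  ", ratios[shifted].mean())
print("mean ratio, unshifted images:", ratios[~shifted].mean())
```

Images with a high estimated ratio are the ones that look more typical of B than of A, so ranking by the ratio is one way to surface the small subset responsible for a shift. A linear classifier is used only to keep the sketch short.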
Problem

Research questions and friction points this paper is trying to address.

semantic dataset comparison
distribution shift
latent space
scalability
sparse differences
Innovation

Methods, ideas, or system contributions that make the work stand out.

LatentDiff
semantic dataset comparison
latent space
sparse autoencoder
density ratio estimation