🤖 AI Summary
Scientific archives contain vast, heterogeneous data spanning disciplines such as ecology, genomics, and climate science, but most existing methods extract structure only for predefined targets and therefore struggle to support open-ended discovery of unknown patterns. This paper proposes an unsupervised decomposition of vision foundation model representations using sparse autoencoders (SAEs) to enable open-ended scientific feature discovery. The method requires neither segmentation nor part-level annotations. In controlled rediscovery studies, the learned SAE features are tested for alignment with semantic concepts on a standard segmentation benchmark and compared against strong label-free alternatives on concept-alignment metrics. Applied to ecological imagery, the same procedure surfaces fine-grained anatomical structure, providing a scientific case study with ground-truth validation. The work positions sparse decomposition as a practical, interpretable, and domain-agnostic instrument for data-driven scientific discovery.
📝 Abstract
Scientific archives now contain hundreds of petabytes of data across genomics, ecology, climate, and molecular biology that could reveal undiscovered patterns if systematically analyzed at scale. Large-scale, weakly supervised datasets in language and vision have driven the development of foundation models whose internal representations encode structure (patterns, co-occurrences, and statistical regularities) beyond their training objectives. Most existing methods extract structure only for pre-specified targets; they excel at confirmation but do not support open-ended discovery of unknown patterns. We ask whether sparse autoencoders (SAEs) can enable open-ended feature discovery from foundation model representations. We evaluate this question in controlled rediscovery studies, where the learned SAE features are tested for alignment with semantic concepts on a standard segmentation benchmark and compared against strong label-free alternatives on concept-alignment metrics. Applied to ecological imagery, the same procedure surfaces fine-grained anatomical structure without access to segmentation or part labels, providing a scientific case study with ground-truth validation. While our experiments focus on vision with an ecology case study, the method is domain-agnostic and applicable to models in other sciences (e.g., proteins, genomics, weather). Our results indicate that sparse decomposition provides a practical instrument for exploring what scientific foundation models have learned, an important prerequisite for moving from confirmation to genuine discovery.
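To make the idea of sparse decomposition concrete, here is a minimal sketch of a sparse autoencoder applied to a single embedding vector. The abstract does not specify the SAE variant, so everything here is an assumption made for illustration: the single-layer ReLU encoder, the linear decoder, the L1 sparsity penalty, the dimensions, and all names (`SparseAutoencoder`, `l1_coeff`, `patch_embedding`). It uses only the Python standard library and untrained random weights, so it shows the shape of the computation rather than meaningful features.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


class SparseAutoencoder:
    """Illustrative ReLU sparse autoencoder (hypothetical, not the paper's code)."""

    def __init__(self, d_model, d_hidden, seed=0):
        rng = random.Random(seed)
        # Encoder maps d_model -> d_hidden (typically d_hidden > d_model).
        self.w_enc = [[rng.gauss(0, 0.1) for _ in range(d_model)]
                      for _ in range(d_hidden)]
        self.b_enc = [0.0] * d_hidden
        # Decoder maps d_hidden -> d_model.
        self.w_dec = [[rng.gauss(0, 0.1) for _ in range(d_hidden)]
                      for _ in range(d_model)]
        self.b_dec = [0.0] * d_model

    def encode(self, x):
        # Subtract the decoder bias, project, add the encoder bias, apply ReLU.
        centered = [a - b for a, b in zip(x, self.b_dec)]
        pre = [p + b for p, b in zip(matvec(self.w_enc, centered), self.b_enc)]
        return [max(0.0, p) for p in pre]

    def decode(self, features):
        # Reconstruct the input as a linear combination of decoder columns.
        return [r + b for r, b in zip(matvec(self.w_dec, features), self.b_dec)]

    def loss(self, x, l1_coeff=1e-3):
        features = self.encode(x)
        x_hat = self.decode(features)
        mse = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
        # Features are non-negative after ReLU, so their sum is the L1 norm.
        sparsity = sum(features)
        return mse + l1_coeff * sparsity, features


# Stand-in for one patch embedding taken from a vision foundation model.
rng = random.Random(1)
patch_embedding = [rng.gauss(0, 1) for _ in range(8)]

sae = SparseAutoencoder(d_model=8, d_hidden=32)
total_loss, features = sae.loss(patch_embedding)
active_units = [i for i, a in enumerate(features) if a > 0]
reconstruction = sae.decode(features)
```

After training on many embeddings, the indices in `active_units` would be the candidate features to inspect for a given patch. Comparing where a unit fires against segmentation masks is one way the concept-alignment evaluation described above could be set up.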