Charting and Navigating Hugging Face's Model Atlas

📅 2025-03-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the difficulty of navigating and analyzing Hugging Face's repository of millions of models, most of which are poorly documented, this paper charts a preliminary, navigable atlas of the documented fraction of the repository. The atlas visualizes the model landscape and its evolution, and supports applications such as predicting model attributes (e.g., accuracy) and analyzing trends in computer vision models. Because the atlas remains incomplete, the authors propose a method for charting undocumented regions: they identify high-confidence structural priors based on dominant real-world model training practices and use them to map previously undocumented areas. The datasets, code, and an interactive atlas are publicly released.

📝 Abstract
As there are now millions of publicly available neural networks, searching and analyzing large model repositories is becoming increasingly important. Navigating so many models requires an atlas, but as most models are poorly documented, charting such an atlas is challenging. To explore the hidden potential of model repositories, we chart a preliminary atlas representing the documented fraction of Hugging Face. It provides stunning visualizations of the model landscape and evolution. We demonstrate several applications of this atlas, including predicting model attributes (e.g., accuracy) and analyzing trends in computer vision models. However, as the current atlas remains incomplete, we propose a method for charting undocumented regions. Specifically, we identify high-confidence structural priors based on dominant real-world model training practices. Leveraging these priors, our approach enables accurate mapping of previously undocumented areas of the atlas. We publicly release our datasets, code, and interactive atlas.
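To make the attribute-prediction application concrete, here is a minimal sketch. It assumes, purely for illustration, that the atlas can be treated as a graph whose edges link a model to the model it was derived from, and that an undocumented attribute can be guessed by a majority vote over the nearest documented neighbors. The page does not describe the authors' actual procedure, so the graph representation, the voting rule, the function names, and the toy model names are all hypothetical.

```python
from collections import Counter, deque

def build_adjacency(edges):
    """Turn (parent, child) pairs into an undirected adjacency map."""
    adj = {}
    for parent, child in edges:
        adj.setdefault(parent, set()).add(child)
        adj.setdefault(child, set()).add(parent)
    return adj

def predict_attribute(adj, known, target, k=3):
    """Majority vote over the k nearest documented models.

    Distance is the hop count found by breadth-first search from `target`.
    Returns None when no documented model is reachable.
    """
    seen = {target}
    queue = deque([target])
    votes = []
    while queue and len(votes) < k:
        node = queue.popleft()
        for neighbor in sorted(adj.get(node, ())):
            if neighbor in seen:
                continue
            seen.add(neighbor)
            queue.append(neighbor)
            if neighbor in known and len(votes) < k:
                votes.append(known[neighbor])
    if not votes:
        return None
    return Counter(votes).most_common(1)[0][0]

# Hypothetical toy atlas: one base model and a few derived models.
edges = [("base", "ft-a"), ("base", "ft-b"), ("base", "ft-c"), ("ft-a", "ft-a2")]
adj = build_adjacency(edges)

# Documented attribute values (assumed labels); "ft-c" and "ft-a2" are undocumented.
known = {
    "base": "image-classification",
    "ft-a": "image-classification",
    "ft-b": "object-detection",
}

prediction = predict_attribute(adj, known, "ft-a2")
```

In this toy graph the three closest documented models to `ft-a2` are `ft-a`, `base`, and `ft-b`, so the vote favors the label shared by the first two. A numeric attribute such as accuracy could be handled the same way by averaging neighbor values instead of voting.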
Problem

Research questions and friction points this paper is trying to address.

Navigating and analyzing large neural network repositories
Charting an atlas for poorly documented models
Mapping undocumented regions using structural priors
Innovation

Methods, ideas, or system contributions that make the work stand out.

Charts a visual atlas of the documented fraction of Hugging Face
Predicts model attributes (e.g., accuracy) using the atlas
Maps undocumented regions using high-confidence structural priors based on dominant training practices
Eliahu Horwitz
School of Computer Science and Engineering, The Hebrew University of Jerusalem, Israel
Nitzan Kurer
School of Computer Science and Engineering, The Hebrew University of Jerusalem, Israel
Jonathan Kahana
School of Computer Science and Engineering, The Hebrew University of Jerusalem, Israel
Liel Amar
School of Computer Science and Engineering, The Hebrew University of Jerusalem, Israel
Yedid Hoshen
The Hebrew University of Jerusalem
Deep Learning, AI, Computer Vision