🤖 AI Summary
This study investigates the dynamic evolution of power structures within the open-weight AI model ecosystem, focusing on decentralization trends on the Hugging Face Model Hub from 2019 to 2025. Methodologically, it leverages longitudinal data covering 851,000 openly released models and 2.2 billion downloads, enriched with over 200 metadata dimensions, and employs statistical modeling coupled with interactive visualization. Key contributions include: (1) empirical evidence of declining dominance by U.S. tech giants and the concurrent rise of independent developers, community-driven organizations, and Chinese models (e.g., Qwen, DeepSeek); (2) quantification of average model size growth (17×), alongside accelerated adoption of multimodal architectures (3.4×), Mixture-of-Experts (7×), and quantized models (5×); (3) the first demonstration that open-weight models now surpass fully open-source models in prevalence, and the identification of a novel intermediary developer cohort specializing in model adaptation and quantization; and (4) the release of the first large-scale open dataset characterizing this model ecosystem, along with a conceptual framework for power rebalancing in open intelligence economies.
📝 Abstract
Since 2019, the Hugging Face Model Hub has been the primary global platform for sharing open-weight AI models. By releasing a dataset of the complete history of weekly model downloads (June 2020–August 2025) alongside model metadata, we provide the most rigorous examination to date of concentration dynamics and evolving characteristics in the open model economy. Our analysis spans 851,000 models, over 200 aggregated attributes per model, and 2.2B downloads. We document a fundamental rebalancing of economic power: U.S. open-weight industry dominance by Google, Meta, and OpenAI has declined sharply in favor of unaffiliated developers, community organizations, and, as of 2025, Chinese industry, with DeepSeek and Qwen models potentially heralding a new consolidation of market power. We identify statistically significant shifts in model properties: a 17× increase in average model size and rapid growth in multimodal generation (3.4×), quantization (5×), and mixture-of-experts architectures (7×), alongside concerning declines in data transparency, with open-weight models surpassing fully open-source models for the first time in 2025. We expose a new layer of developer intermediaries that has emerged, focused on quantizing and adapting base models for both efficiency and artistic expression. To enable continued research and oversight, we release the complete dataset with an interactive dashboard for real-time monitoring of concentration dynamics and evolving properties in the open model economy.
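To make "concentration dynamics" concrete, one standard way to quantify market concentration from per-publisher download counts is the Herfindahl-Hirschman Index (sum of squared market shares). This is an illustrative sketch, not necessarily the paper's exact metric; the organization names and download figures below are hypothetical.

```python
# Illustrative sketch (not the paper's exact method): measuring download
# concentration among model publishers with the Herfindahl-Hirschman Index.

def herfindahl_index(downloads_by_org: dict[str, int]) -> float:
    """HHI = sum of squared market shares; 1.0 = monopoly, near 0 = dispersed."""
    total = sum(downloads_by_org.values())
    if total == 0:
        return 0.0
    return sum((d / total) ** 2 for d in downloads_by_org.values())

# Hypothetical weekly download totals per publishing organization.
week_early = {"google": 500, "meta": 300, "openai": 150, "community": 50}
week_late = {"google": 200, "meta": 180, "qwen": 220, "deepseek": 190,
             "community": 210}

print(round(herfindahl_index(week_early), 3))  # 0.365 (more concentrated)
print(round(herfindahl_index(week_late), 3))   # 0.201 (more dispersed)
```

A falling HHI over time would correspond to the rebalancing described above: downloads spreading from a few dominant labs toward a broader set of publishers.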