Will LLMs Scaling Hit the Wall? Breaking Barriers via Distributed Resources on Massive Edge Devices

📅 2025-03-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language models (LLMs) face scalability bottlenecks due to the exhaustion of high-quality public data and the concentration of computational resources among tech giants. Method: The paper proposes a decentralized LLM training paradigm that treats massive fleets of edge devices as a distributed AI infrastructure, systematically integrating federated learning, distributed optimization, edge-native data governance, and lightweight model co-training. Contribution/Results: It first demonstrates the feasibility of training large models with coordinated clusters of resource-constrained edge devices, highlighting the synergistic potential of trillion-scale edge compute and heterogeneous private data. It then lays out a theoretical framework and a practical technical pathway for democratizing AI development, lowering entry barriers for non-industrial stakeholders and enabling community-driven LLM research and innovation. The approach shifts LLM training from centralized, data-hungry paradigms toward privacy-aware, scalable, and inclusive decentralized collaboration.
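
As a concrete anchor for the federated-learning ingredient named above, the snippet below sketches one round of federated averaging (FedAvg): each edge device trains on its own private data, and only model parameters travel to the coordinator, which averages them weighted by local dataset size. The toy linear model, function names, and training loop are illustrative assumptions, not the paper's actual protocol.

```python
import numpy as np

def grad_fn(w, x, y):
    """Squared-error gradient for a toy linear model; stands in for the real model's loss."""
    return 2.0 * (w @ x - y) * x

def local_update(global_w, local_data, lr=0.01, epochs=1):
    """One edge device's local training pass on its private data."""
    w = global_w.copy()
    for _ in range(epochs):
        for x, y in local_data:
            w -= lr * grad_fn(w, x, y)
    return w

def fedavg_round(global_w, clients):
    """One FedAvg round: devices train locally; the coordinator averages the
    returned parameters, weighted by each device's dataset size."""
    updates = [local_update(global_w, data) for data in clients]
    sizes = np.array([len(data) for data in clients], dtype=float)
    return np.average(np.stack(updates), axis=0, weights=sizes / sizes.sum())

# Toy usage: three "edge devices" with private datasets of different sizes.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for n in (5, 8, 3):
    xs = rng.normal(size=(n, 2))
    clients.append([(x, x @ true_w + rng.normal(scale=0.1)) for x in xs])

w = np.zeros(2)
for _ in range(50):
    w = fedavg_round(w, clients)
# w now approximates true_w, without any device sharing its raw data
```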

📝 Abstract
The remarkable success of foundation models has been driven by scaling laws, demonstrating that model performance improves predictably with increased training data and model size. However, this scaling trajectory faces two critical challenges: the depletion of high-quality public data, and the prohibitive computational power required for larger models, which has been monopolized by tech giants. These two bottlenecks pose significant obstacles to the further development of AI. In this position paper, we argue that leveraging massive distributed edge devices can break through these barriers. We reveal the vast untapped potential of data and computational resources on massive edge devices, and review recent technical advancements in distributed/federated learning that make this new paradigm viable. Our analysis suggests that, by collaborating, everyone can participate in training large language models with small edge devices. This paradigm shift towards distributed training on edge has the potential to democratize AI development and foster a more inclusive AI community.
Problem

Research questions and friction points this paper is trying to address.

Address depletion of high-quality public data for AI training.
Overcome prohibitive computational power for large model scaling.
Enable distributed training on edge devices to democratize AI.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverage distributed edge devices for AI training
Utilize federated learning for resource efficiency
Enable large model training on small edge devices (a lightweight co-training sketch follows below)
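
To make "large model training on small edge devices" more tangible, the sketch below shows one plausible instantiation of lightweight model co-training: each device keeps the large base weights frozen and trains and communicates only small low-rank adapter matrices (a LoRA-style update), which the coordinator then averages. The adapter parameterization, shapes, and learning rate are assumptions for illustration; the paper does not prescribe this exact scheme.

```python
import numpy as np

def lora_forward(x, W_frozen, A, B):
    """Adapter forward pass: the frozen base weight plus a low-rank update B @ A."""
    return x @ (W_frozen + B @ A).T

def client_adapter_update(W_frozen, A, B, data, lr=0.05, epochs=1):
    """Train only the small adapter matrices locally; the large base stays frozen."""
    A, B = A.copy(), B.copy()
    for _ in range(epochs):
        for x, y in data:
            err = lora_forward(x, W_frozen, A, B) - y   # prediction error for a squared loss
            grad_A = np.outer(B.T @ err, x)             # d(0.5*||err||^2) / dA
            grad_B = np.outer(err, A @ x)               # d(0.5*||err||^2) / dB
            A -= lr * grad_A
            B -= lr * grad_B
    return A, B

def aggregate_adapters(updates):
    """Coordinator averages only the adapters, so communication stays tiny."""
    As, Bs = zip(*updates)
    return np.mean(As, axis=0), np.mean(Bs, axis=0)

# Toy usage: a 64x64 "large" frozen base and rank-4 adapters on two devices.
rng = np.random.default_rng(1)
W_frozen = rng.normal(scale=0.1, size=(64, 64))
A0 = rng.normal(scale=0.01, size=(4, 64))
B0 = np.zeros((64, 4))   # zero-init so the adapter starts as a no-op
devices = [[(rng.normal(size=64), rng.normal(size=64)) for _ in range(10)] for _ in range(2)]
A_new, B_new = aggregate_adapters(
    [client_adapter_update(W_frozen, A0, B0, data) for data in devices]
)
```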

👥 Authors

Tao Shen
College of Computer Science and Technology, Zhejiang University, Hangzhou, China

Didi Zhu
Imperial College London
Multi-Modal LLMs · Out-of-Distribution Generalization

Ziyu Zhao
University of South Carolina
Computer Vision · 2D/3D Segmentation · Generative 3D Reconstruction

Chao Wu
School of Public Affairs and Academy of Social Governance, Zhejiang University, Hangzhou, China

Fei Wu
College of Computer Science and Technology, Zhejiang University, Hangzhou, China