Efficient Large Foundation Models Design: A Perspective From Model and System Co-Design

📅 2024-09-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language models (LLMs) suffer from prohibitively high computational costs, resource consumption, and deployment expenses in both training and inference. To address these challenges, this work proposes the first holistic model-system co-design paradigm for efficient LLMs, unifying key optimization avenues—including sparsification, quantization, attention mechanism optimization, memory-aware scheduling, compiler-level acceleration, and hardware adaptation—into a coherent framework. Through cross-layer joint optimization, the authors establish a comprehensive, full-stack technical taxonomy spanning training and inference, and publicly release a structured, open-source knowledge base. Beyond systematically categorizing and evaluating state-of-the-art efficient LLM techniques, the work provides a reusable methodological framework and practical implementation guidelines, improving model efficiency, affordability, and accessibility, and establishing a standardized research infrastructure for both academia and industry.
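Among the optimization avenues the summary lists, weight quantization is the simplest to illustrate. The sketch below (not taken from the paper; all names are illustrative) shows symmetric per-tensor int8 post-training quantization, one common scheme surveys of this kind cover:

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor int8 quantization: map floats into [-127, 127]."""
    scale = np.max(np.abs(weights)) / 127.0  # one scale for the whole tensor
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from int8 codes."""
    return q.astype(np.float32) * scale

# Toy weight tensor; real LLM layers would be quantized per-tensor or per-channel.
w = np.array([0.5, -1.0, 0.25, 0.75], dtype=np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
# Reconstruction error is bounded by half the quantization step (scale / 2).
```

Storing `q` in int8 cuts weight memory 4x versus float32 at the cost of this bounded rounding error, which is the basic efficiency/accuracy trade-off quantization methods refine.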

📝 Abstract
This paper focuses on modern efficient training and inference technologies for foundation models and illustrates them from two perspectives: model design and system design. These two perspectives optimize LLM training and inference from different angles to save computational resources, making LLMs more efficient, affordable, and accessible. The paper list repository is available at https://github.com/NoakLiu/Efficient-Foundation-Models-Survey
Problem

Research questions and friction points this paper is trying to address.

Efficient Large Models
Simplified Modeling
Resource-saving Training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Efficient Method
Large Language Model
Resource Saving