🤖 AI Summary
Large language models (LLMs) incur prohibitively high computational cost, resource consumption, and deployment expense in both training and inference. To address this, this work proposes the first holistic model-system co-design paradigm for efficient LLMs, unifying key optimization avenues (including sparsification, quantization, attention-mechanism optimization, memory-aware scheduling, compiler-level acceleration, and hardware adaptation) into a coherent framework. Through cross-layer joint optimization, we establish a full-stack technical taxonomy spanning training and inference, and publicly release a structured, open-source knowledge base. Beyond systematically categorizing and evaluating state-of-the-art efficient-LLM techniques, our contribution provides a reusable methodological framework and practical implementation guidelines. This significantly improves model efficiency, affordability, and accessibility, establishing a standardized research infrastructure for both academia and industry.
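To make one of the listed avenues concrete, here is a minimal, self-contained sketch of post-training weight quantization, one of the simplest efficiency techniques the taxonomy covers. It is an illustrative example only (symmetric per-tensor int8 quantization with NumPy), not a method from the surveyed papers; the function names are hypothetical.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: w is approximated by scale * q."""
    scale = np.abs(w).max() / 127.0           # map the largest magnitude to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float32 weight matrix from int8 codes."""
    return q.astype(np.float32) * scale

# Round-trip a random weight matrix and measure the reconstruction error.
rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
max_err = float(np.abs(w - w_hat).max())      # bounded by half a quantization step
```

Storing `q` instead of `w` cuts weight memory by 4x relative to float32, at the cost of a bounded rounding error per entry; real systems refine this with per-channel scales and calibration.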
📝 Abstract
This paper focuses on modern efficient training and inference technologies for foundation models and examines them from two perspectives: model design and system design. The two perspectives optimize LLM training and inference from different angles to save computational resources, making LLMs more efficient, affordable, and accessible. The paper-list repository is available at [https://github.com/NoakLiu/Efficient-Foundation-Models-Survey](https://github.com/NoakLiu/Efficient-Foundation-Models-Survey).