TinyTorch: Building Machine Learning Systems from First Principles

📅 2026-01-27
🤖 AI Summary
This work addresses the disconnect between algorithmic theory and system implementation in machine learning education by proposing an “implementation-based systems pedagogy.” Students build PyTorch's core components—tensors, automatic differentiation, optimizers, CNNs, and Transformers—from scratch in pure Python, requiring only a 4GB RAM machine and no GPU. The approach integrates three teaching patterns: progressive disclosure of complexity, systems-first integration of profiling, and build-to-validate milestones that recreate key breakthroughs spanning 67 years of ML history. The resulting lightweight, open-source, low-hardware-barrier curriculum deepens students' understanding of memory, latency, and deployment trade-offs, cultivating the ML systems engineering skills industry demands.

📝 Abstract
Machine learning education faces a fundamental gap: students learn algorithms without understanding the systems that execute them. They study gradient descent without measuring memory, attention mechanisms without analyzing O(N^2) scaling, optimizer theory without knowing why Adam requires 3x the memory of SGD. This "algorithm-systems divide" produces practitioners who can train models but cannot debug memory failures, optimize inference latency, or reason about deployment trade-offs--the very skills industry demands as "ML systems engineering." We present TinyTorch, a 20-module curriculum that closes this gap through "implementation-based systems pedagogy": students construct PyTorch's core components (tensors, autograd, optimizers, CNNs, transformers) in pure Python, building a complete framework where every operation they invoke is code they wrote. The design employs three patterns: "progressive disclosure" of complexity, "systems-first integration" of profiling from the first module, and "build-to-validate milestones" recreating 67 years of ML breakthroughs--from Perceptron (1958) through Transformers (2017) to MLPerf-style benchmarking. Requiring only 4GB RAM and no GPU, TinyTorch demonstrates that deep ML systems understanding is achievable without specialized hardware. The curriculum is available open-source at mlsysbook.ai/tinytorch.
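The abstract's "Adam requires 3x the memory of SGD" claim follows directly from the optimizers' update rules, and a pure-Python sketch in TinyTorch's spirit makes it concrete. This is not TinyTorch's actual code, just a minimal illustration: SGD updates parameters with no persistent state, while Adam must keep two extra buffers (first moment m, second moment v), each the same size as the parameters, hence roughly 3x the parameter memory.

```python
def sgd_step(params, grads, lr=0.01):
    # No optimizer state: memory footprint ~ 1x the parameters.
    return [p - lr * g for p, g in zip(params, grads)]

def adam_step(params, grads, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    # m and v each mirror the parameter list: footprint ~ 3x the parameters.
    new_p, new_m, new_v = [], [], []
    for p, g, mi, vi in zip(params, grads, m, v):
        mi = b1 * mi + (1 - b1) * g          # update first moment estimate
        vi = b2 * vi + (1 - b2) * g * g      # update second moment estimate
        m_hat = mi / (1 - b1 ** t)           # bias correction (t = step count, from 1)
        v_hat = vi / (1 - b2 ** t)
        new_p.append(p - lr * m_hat / (v_hat ** 0.5 + eps))
        new_m.append(mi)
        new_v.append(vi)
    return new_p, new_m, new_v
```

A single step shows the difference in state that must persist between iterations: SGD returns only new parameters, while Adam also returns (and must retain) m and v, which is exactly the overhead students measure in the curriculum.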
Problem

Research questions and friction points this paper is trying to address.

machine learning education
algorithm-systems divide
ML systems engineering
implementation gap
systems understanding
Innovation

Methods, ideas, or system contributions that make the work stand out.

implementation-based systems pedagogy
progressive disclosure
systems-first integration
ML systems engineering
from-scratch framework