TinyTorch: Building Machine Learning Systems from First Principles

📅 2026-01-27
🤖 AI Summary
This work addresses the disconnect between algorithmic theory and system implementation in machine learning education by proposing an “implementation-based systems pedagogy.” Students build PyTorch's core components—tensors, automatic differentiation, optimizers, CNNs, and Transformers—from scratch in pure Python, requiring only a 4GB RAM machine and no GPU. The approach integrates three teaching patterns: progressive disclosure of complexity, systems-first integration of profiling, and build-to-validate milestones that recreate key breakthroughs spanning 67 years of ML history. The resulting lightweight, open-source, low-hardware-barrier curriculum deepens students' understanding of memory, latency, and deployment trade-offs, cultivating the ML systems engineering skills industry demands.

📝 Abstract
Machine learning education faces a fundamental gap: students learn algorithms without understanding the systems that execute them. They study gradient descent without measuring memory, attention mechanisms without analyzing O(N^2) scaling, optimizer theory without knowing why Adam requires 3x the memory of SGD. This "algorithm-systems divide" produces practitioners who can train models but cannot debug memory failures, optimize inference latency, or reason about deployment trade-offs--the very skills industry demands as "ML systems engineering." We present TinyTorch, a 20-module curriculum that closes this gap through "implementation-based systems pedagogy": students construct PyTorch's core components (tensors, autograd, optimizers, CNNs, transformers) in pure Python, building a complete framework where every operation they invoke is code they wrote. The design employs three patterns: "progressive disclosure" of complexity, "systems-first integration" of profiling from the first module, and "build-to-validate milestones" recreating 67 years of ML breakthroughs--from Perceptron (1958) through Transformers (2017) to MLPerf-style benchmarking. Requiring only 4GB RAM and no GPU, TinyTorch demonstrates that deep ML systems understanding is achievable without specialized hardware. The curriculum is available open-source at mlsysbook.ai/tinytorch.
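The abstract's "Adam requires 3x the memory of SGD" claim follows directly from the optimizers' update rules, and a pure-Python sketch in TinyTorch's spirit makes it concrete. This is not TinyTorch's actual code, just a minimal illustration: SGD updates parameters with no persistent state, while Adam must keep two extra buffers (first moment m, second moment v), each the same size as the parameters, hence roughly 3x the parameter memory.

```python
def sgd_step(params, grads, lr=0.01):
    # No optimizer state: memory footprint ~ 1x the parameters.
    return [p - lr * g for p, g in zip(params, grads)]

def adam_step(params, grads, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    # m and v each mirror the parameter list: footprint ~ 3x the parameters.
    new_p, new_m, new_v = [], [], []
    for p, g, mi, vi in zip(params, grads, m, v):
        mi = b1 * mi + (1 - b1) * g          # update first moment estimate
        vi = b2 * vi + (1 - b2) * g * g      # update second moment estimate
        m_hat = mi / (1 - b1 ** t)           # bias correction (t = step count, from 1)
        v_hat = vi / (1 - b2 ** t)
        new_p.append(p - lr * m_hat / (v_hat ** 0.5 + eps))
        new_m.append(mi)
        new_v.append(vi)
    return new_p, new_m, new_v
```

A single step shows the difference in state that must persist between iterations: SGD returns only new parameters, while Adam also returns (and must retain) m and v, which is exactly the overhead students measure in the curriculum.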
Problem

Research questions and friction points this paper is trying to address.

machine learning education
algorithm-systems divide
ML systems engineering
implementation gap
systems understanding
Innovation

Methods, ideas, or system contributions that make the work stand out.

implementation-based systems pedagogy
progressive disclosure
systems-first integration
ML systems engineering
from-scratch framework