Stdlib or Third-Party? Empirical Performance and Correctness of LLM-Assisted Zero-Dependency Python Libraries

📅 2026-05-20

🤖 AI Summary
Third-party Python libraries often impose significant costs in dependency management, supply chain risk, and deployment complexity. This work introduces the zerodep project, which uses large language models under strict constraints to reimplement popular libraries as more than 40 single-file, zero-dependency, API-compatible modules built only on the Python standard library. The study empirically maps where the standard library suffices and evaluates how well LLMs generate correct, performant code under these constraints. Benchmarks show that most reimplementations run within a factor of two of the original libraries, and several categories see speedups of 5× to 115×. The main performance cliff appears where the reference library relies on C extensions (image processing, binary serialization, low-level cryptography), not where it is pure Python; many widely used libraries carry architectural overhead that the stdlib reimplementations avoid.
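As a rough illustration of what "within a factor of two" could mean in practice, the following sketch first checks that a candidate and a reference agree on their outputs, then times both and buckets the ratio. It is not the zerodep benchmark harness: the function names, the symmetric parity band, and the `json`-based stand-ins are all assumptions made for this example.

```python
import json
import timeit


def outputs_match(candidate, reference, inputs):
    """Return True if both callables give equal results on every input."""
    return all(candidate(x) == reference(x) for x in inputs)


def time_ratio(candidate, reference, inputs, number=200, repeat=5):
    """Return candidate_time / reference_time, using the best of `repeat` runs."""
    def run(fn):
        return min(timeit.repeat(lambda: [fn(x) for x in inputs],
                                 number=number, repeat=repeat))
    return run(candidate) / run(reference)


def classify(ratio, parity_factor=2.0):
    """Bucket a time ratio. The symmetric band [1/2, 2] is an assumption."""
    if ratio > parity_factor:
        return "slower"
    if ratio < 1.0 / parity_factor:
        return "faster"
    return "parity"


# Hypothetical stand-ins: a real comparison would import a third-party
# reference library and its stdlib-only reimplementation instead.
def reference_dumps(obj):
    return json.dumps(obj, sort_keys=True)


def candidate_dumps(obj):
    # Passing the default separators explicitly keeps the output identical.
    return json.dumps(obj, sort_keys=True, separators=(", ", ": "))


samples = [{"b": [1, 2, 3], "a": "text"}, [1.5, None, True], "plain string"]
```

A typical use would call `outputs_match(candidate_dumps, reference_dumps, samples)` before timing, and only then report `classify(time_ratio(candidate_dumps, reference_dumps, samples))`, so that a fast but incorrect candidate is never counted as a win.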
📝 Abstract
Third-party Python libraries introduce dependency management overhead, supply chain risk, and deployment friction in constrained environments. A natural question is how much of this ecosystem can be replicated using only Python's standard library, and at what correctness and performance cost. We address this empirically through zerodep, a growing collection of single-file Python modules, each a stdlib-only reimplementation of a popular third-party library, developed with LLM assistance under strict constraints: no external imports, single file, drop-in API compatibility, and mandatory correctness validation against the reference library. Spanning over 40 modules across 12 categories, including serialization, networking, cryptography, agent protocols, and text processing, zerodep provides a controlled testbed for two interrelated questions: (1) Where does the stdlib suffice? and (2) Can LLMs effectively generate correct, performant code under tight symbolic constraints? Systematic benchmarking shows that stdlib-only implementations achieve performance parity (within 2x of the reference) in the majority of cases. The primary performance cliff is C-extension-backed computation (image processing, binary serialization, low-level crypto), not the inherent overhead of pure-Python third-party libraries. Conversely, many widely used libraries carry architectural overhead that LLM-generated stdlib reimplementations avoid, yielding 5x to 115x speedups in several categories. We characterize the stdlib capability boundary across complexity tiers and library categories, discuss where LLM-assisted development succeeds and where it requires iterative human correction, and examine implications for dependency-free software engineering at scale. zerodep is open-source at https://github.com/Oaklight/zerodep.
Problem

Research questions and friction points this paper is trying to address.

stdlib
third-party libraries
dependency management
performance
correctness
Innovation

Methods, ideas, or system contributions that make the work stand out.

zero-dependency
stdlib-only
LLM-assisted code generation
empirical benchmarking
drop-in compatibility