Old is Gold: Optimizing Single-threaded Applications with Exgen-Malloc

📅 2025-10-11
🤖 AI Summary
Modern multi-threaded memory allocators incur avoidable overhead in single-threaded scenarios because of their complex metadata and control logic. To address this, the paper proposes Exgen-Malloc, an allocator purpose-built for single-threaded applications. It carries design principles from modern multi-threaded allocators, which legacy single-threaded allocators such as dlmalloc lack, into a streamlined single-threaded architecture. Exgen-Malloc uses a centralized heap, a single free-block list, and a balanced strategy for memory commitment and relocation to cut metadata and simplify control flow. Evaluated on two Intel Xeon platforms, it achieves speedups of 1.17×, 1.10×, and 1.93× over dlmalloc on SPEC CPU2017, redis-benchmark, and mimalloc-bench, respectively, and memory savings of 6.2%, 0.1%, and 25.2% over mimalloc on the same benchmarks.

📝 Abstract
Memory allocators hide beneath nearly every application stack, yet their performance footprint extends far beyond their code size. Even small inefficiencies in an allocator ripple through caches and the rest of the memory hierarchy, collectively imposing what operators often call a "datacenter tax". At hyperscale, even a 1% improvement in allocator efficiency can unlock millions of dollars in savings and measurable reductions in datacenter energy consumption. Modern memory allocators are designed to optimize allocation speed and limit memory fragmentation in multi-threaded environments, relying on complex metadata and control logic to achieve high performance. However, the overhead introduced by this complexity prompts a reevaluation of allocator design. Notably, such overhead can be avoided in single-threaded scenarios, which continue to be widely used across diverse application domains. In this paper, we introduce Exgen-Malloc, a memory allocator purpose-built for single-threaded applications. By specializing for single-threaded execution, Exgen-Malloc eliminates unnecessary metadata and simplifies control flow, thereby reducing overhead and improving allocation efficiency. Its core design features include a centralized heap, a single free-block list, and a balanced strategy for memory commitment and relocation. Additionally, Exgen-Malloc incorporates design principles from modern multi-threaded allocators that are absent from legacy single-threaded allocators such as dlmalloc. We evaluate Exgen-Malloc on two Intel Xeon platforms. Across both systems, Exgen-Malloc achieves speedups of 1.17x, 1.10x, and 1.93x over dlmalloc on SPEC CPU2017, redis-benchmark, and mimalloc-bench, respectively. In addition to performance, Exgen-Malloc achieves 6.2%, 0.1%, and 25.2% memory savings over mimalloc on SPEC CPU2017, redis-benchmark, and mimalloc-bench, respectively.
Problem

Research questions and friction points this paper is trying to address.

Optimizing memory allocators for single-threaded application efficiency
Reducing overhead from complex metadata in single-threaded scenarios
Improving allocation speed and reducing memory fragmentation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Specialized memory allocator for single-threaded applications
Simplifies control flow and eliminates unnecessary metadata overhead
Uses centralized heap and single free-block list design
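To make the "centralized heap and single free-block list" idea concrete, here is a minimal sketch in C. It is not Exgen-Malloc's implementation: every name (`st_heap`, `st_alloc`, `st_free`, `ST_BLOCK_BYTES`, `ST_BLOCK_COUNT`) is hypothetical, and it assumes one fixed block size and a statically sized heap purely to keep the example short. What it illustrates is the kind of simplification single-threaded specialization allows: one heap owned by one thread, one intrusive free list, and no locks, atomics, or per-thread caches.

```c
#include <stddef.h>

/* Hypothetical sketch; names and sizes are assumptions, not Exgen-Malloc's. */
#define ST_BLOCK_BYTES 64    /* one fixed block size, for brevity */
#define ST_BLOCK_COUNT 1024  /* capacity of the single central heap */

/* A block is either user payload or, while free, a link in the free list. */
typedef union st_block {
    union st_block *next;                   /* valid only while the block is free */
    unsigned char payload[ST_BLOCK_BYTES];  /* valid only while allocated */
} st_block;

typedef struct {
    st_block blocks[ST_BLOCK_COUNT];  /* the centralized heap */
    size_t used;                      /* blocks handed out at least once */
    st_block *free_list;              /* the single list of freed blocks */
} st_heap;

/* One heap, one owning thread: no locks or atomics are needed. */
static st_heap g_heap;

/* Returns a block of ST_BLOCK_BYTES bytes, or NULL if the request cannot
 * be served. Payloads are aligned for pointer-sized types only. */
void *st_alloc(size_t size) {
    if (size == 0 || size > ST_BLOCK_BYTES) {
        return NULL;
    }
    if (g_heap.free_list != NULL) {
        /* Reuse the most recently freed block (LIFO order). */
        st_block *b = g_heap.free_list;
        g_heap.free_list = b->next;
        return b->payload;
    }
    if (g_heap.used == ST_BLOCK_COUNT) {
        return NULL;  /* heap exhausted */
    }
    /* Otherwise carve a never-used block from the heap. */
    return g_heap.blocks[g_heap.used++].payload;
}

/* Pushes a block back onto the single free list. The link is stored
 * inside the freed block itself, so no side metadata is required. */
void st_free(void *p) {
    if (p == NULL) {
        return;
    }
    st_block *b = (st_block *)p;
    b->next = g_heap.free_list;
    g_heap.free_list = b;
}
```

Because the free list is intrusive and last-in, first-out, a block freed and then immediately reallocated comes back at the same address, which tends to keep recently touched memory in cache. A real allocator would additionally need size classes, on-demand memory commitment from the operating system, and a way to return memory, none of which this sketch attempts.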
Ruihao Li
The University of Texas at Austin, USA
Lizy K. John
The University of Texas at Austin, USA
Neeraja J. Yadwadkar
Assistant Professor, University of Texas at Austin
Networked Systems, Cloud Computing, Machine Learning