MapReplay: Trace-Driven Benchmark Generation for Java HashMap

📅 2026-03-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a longstanding difficulty in evaluating Java HashMap performance: microbenchmarks oversimplify workloads, while application benchmarks are noisy and expensive to run, which makes the effect of an optimization hard to measure. To bridge this gap, the authors propose MapReplay, a methodology that traces HashMap API invocation sequences from real applications and constructs lightweight, replayable benchmarks that reproduce those usage patterns while reconstructing the internal state of each map. By combining the realism of application benchmarks with the efficiency of microbenchmarks, MapReplay supports evaluation of HashMap implementations under recorded operation sequences. The resulting MapReplayBench suite reproduces the performance trends observed in DaCapo-Chopin and Renaissance, reduces experimentation time, and reveals performance characteristics that are difficult to observe in full-application benchmarks.

📝 Abstract

Hash-based maps, particularly java.util.HashMap, are pervasive in Java applications and the JVM, making their performance critical. Evaluating optimizations is challenging because performance depends on factors such as operation patterns, key distributions, and resizing behavior. Microbenchmarks are fast and repeatable but often oversimplify workloads, failing to capture realistic usage patterns. Application benchmarks (e.g., DaCapo, Renaissance) provide realistic usage but are more expensive to run, prone to variability, and dominated by non-HashMap computation, making map-related performance changes difficult to observe. To address this challenge, we propose MapReplay, a benchmarking methodology that combines the realism of application benchmarks with the efficiency of microbenchmarks. MapReplay traces HashMap API usage and generates a replay workload that reproduces the same operation sequence while faithfully reconstructing internal map states. This enables efficient evaluation of alternative implementations under realistic usage patterns. Applying MapReplay to DaCapo-Chopin and Renaissance, the resulting suite, MapReplayBench, reproduces application-level performance trends while reducing experimentation time and revealing insights that are difficult to obtain from full benchmarks.
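The trace-then-replay idea in the abstract can be illustrated with a minimal Java sketch. This is not MapReplay's implementation: the names `MapReplaySketch`, `TracingMap`, `Event`, and `replay` are hypothetical, only three operations are recorded, and the reconstruction of internal map state (for example preserving the original key hash codes and table capacity) is left out entirely.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; every name here is hypothetical, not MapReplay's API.
final class MapReplaySketch {

    enum Op { PUT, GET, REMOVE }

    // One recorded API call: the operation kind plus its arguments.
    static final class Event {
        final Op op;
        final Object key;
        final Object value; // null for GET and REMOVE

        Event(Op op, Object key, Object value) {
            this.op = op;
            this.key = key;
            this.value = value;
        }
    }

    // Wraps a real HashMap and appends every call to a shared trace.
    static final class TracingMap<K, V> {
        private final Map<K, V> delegate = new HashMap<>();
        private final List<Event> trace;

        TracingMap(List<Event> trace) { this.trace = trace; }

        V put(K key, V value) {
            trace.add(new Event(Op.PUT, key, value));
            return delegate.put(key, value);
        }

        V get(K key) {
            trace.add(new Event(Op.GET, key, null));
            return delegate.get(key);
        }

        V remove(K key) {
            trace.add(new Event(Op.REMOVE, key, null));
            return delegate.remove(key);
        }

        int size() { return delegate.size(); }
    }

    // Applies the recorded operations, in order, to any Map implementation.
    static Map<Object, Object> replay(List<Event> trace, Map<Object, Object> target) {
        for (Event e : trace) {
            switch (e.op) {
                case PUT: target.put(e.key, e.value); break;
                case GET: target.get(e.key); break;
                case REMOVE: target.remove(e.key); break;
            }
        }
        return target;
    }

    public static void main(String[] args) {
        // "Application" phase: normal map usage, recorded as a side effect.
        List<Event> trace = new ArrayList<>();
        TracingMap<String, Integer> app = new TracingMap<>(trace);
        app.put("a", 1);
        app.put("b", 2);
        app.get("a");
        app.remove("b");

        // "Benchmark" phase: only the map operations are re-executed.
        long start = System.nanoTime();
        Map<Object, Object> replayed = replay(trace, new HashMap<>());
        long elapsed = System.nanoTime() - start;
        System.out.println(trace.size() + " events replayed in " + elapsed
                + " ns; final size " + replayed.size());
    }
}
```

Because `replay` accepts any `Map`, the same trace can be run against an alternative implementation by passing a different target. The single `System.nanoTime()` measurement is only a placeholder; a real measurement would likely use a harness such as JMH, with warm-up iterations and consumption of `get` results so the JIT cannot eliminate them.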

Problem

Research questions and friction points this paper is trying to address.

HashMap

benchmarking

performance evaluation

trace-driven

Java

Innovation

Methods, ideas, or system contributions that make the work stand out.

trace-driven benchmarking

HashMap performance

workload replay