Misleading Microbenchmarks on the Java Virtual Machines

📅 2026-05-22

🤖 AI Summary

This study addresses the misleading nature of Java microbenchmarks run in isolation, which often yield distorted performance results because the JVM's dynamic compiler relies on runtime profiling data, particularly branch probabilities and call-site receiver types, that can be inaccurate outside a realistic context. The paper presents the first systematic investigation into the profiling biases introduced when microbenchmarks lack realistic contextual information, and shows that these distortions persist even when benchmarks strictly adhere to JMH best practices. Combining JMH with an analysis of JVM dynamic compilation internals and runtime profiling, the authors empirically examine representative cases of benchmark misinterpretation and propose an extended set of practical guidelines. These recommendations are intended to make microbenchmark results more representative of real-world application performance.

📝 Abstract

Developers often use microbenchmarks to choose the most performant implementation of a method or a class. On the Java Virtual Machine (JVM), this is commonly done using the Java Microbenchmark Harness (JMH), which addresses common pitfalls of measuring code performance on the JVM. However, even following JMH guidelines cannot overcome the fundamental issue of context. Microbenchmarks inherently execute code in isolation, without interference from other application code competing for CPU resources, such as cache or branch-predictor capacity. On managed runtimes with tiered dynamic compilation, such as the JVM, the speculative, profile-driven nature of compilation decisions means that code performance is highly dependent on profiles collected during early execution. Because profiles usually include branch probabilities and receiver types in addition to code hotness metrics, a badly designed microbenchmark may cause the JVM to collect an unrealistic profile, resulting in aggressive yet misleading optimizations that would not occur in a real application. In this paper, we demonstrate how using microbenchmarks under conditions that induce the JVM to collect unrealistic profiles yields misleading results despite following existing guidelines. We also extend these guidelines by suggesting actions to make the microbenchmark results more representative.
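
To make the receiver-type issue concrete, the following is a minimal sketch and not the paper's own benchmark. It deliberately avoids JMH so that it stays dependency-free, and every name in it (`ProfileContextSketch`, `Shape`, `sumAreas`, `singleReceiver`, `mixedReceivers`) is hypothetical. The method `sumAreas` contains a single virtual call site. Warming it up with only one receiver type lets the JIT compiler see a monomorphic profile, which on HotSpot typically allows the call to be inlined behind a type guard. A mix of three types gives the same call site a profile closer to what polymorphic application code might produce.

```java
import java.util.Random;

public class ProfileContextSketch {
    // Hypothetical type hierarchy; the call to area() is the profiled call site.
    interface Shape { double area(); }

    static final class Square implements Shape {
        final double side;
        Square(double side) { this.side = side; }
        public double area() { return side * side; }
    }

    static final class Rect implements Shape {
        final double w, h;
        Rect(double w, double h) { this.w = w; this.h = h; }
        public double area() { return w * h; }
    }

    static final class Tri implements Shape {
        final double base, height;
        Tri(double base, double height) { this.base = base; this.height = height; }
        public double area() { return 0.5 * base * height; }
    }

    // Code under test: one virtual call site whose receiver-type profile
    // depends entirely on the inputs seen while the JVM is still profiling.
    static double sumAreas(Shape[] shapes) {
        double total = 0.0;
        for (Shape s : shapes) {
            total += s.area();
        }
        return total;
    }

    // Unrealistic input: every receiver is a Square (monomorphic profile).
    static Shape[] singleReceiver(int n) {
        Shape[] out = new Shape[n];
        for (int i = 0; i < n; i++) {
            out[i] = new Square(2.0);
        }
        return out;
    }

    // Input with three receiver types in random order (assumed here to be
    // closer to the application's real mix; that mix must be measured).
    static Shape[] mixedReceivers(int n, long seed) {
        Random rnd = new Random(seed);
        Shape[] out = new Shape[n];
        for (int i = 0; i < n; i++) {
            switch (rnd.nextInt(3)) {
                case 0:  out[i] = new Square(2.0); break;
                case 1:  out[i] = new Rect(2.0, 3.0); break;
                default: out[i] = new Tri(2.0, 3.0); break;
            }
        }
        return out;
    }

    // Crude timing loop: warm up, then report average nanoseconds per call.
    static long averageNanos(Shape[] input, int rounds) {
        double sink = 0.0;
        for (int i = 0; i < rounds; i++) {
            sink += sumAreas(input);          // warm-up, also collects the profile
        }
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            sink += sumAreas(input);
        }
        long elapsed = System.nanoTime() - start;
        if (sink == 42.0) {
            System.out.println(sink);         // keeps the result observable
        }
        return elapsed / rounds;
    }

    public static void main(String[] args) {
        // Pick ONE variant per JVM run; running both in the same process would
        // let the first variant's profile influence the second measurement.
        boolean mixed = args.length > 0 && args[0].equals("mixed");
        Shape[] input = mixed ? mixedReceivers(10_000, 42L) : singleReceiver(10_000);
        System.out.println((mixed ? "mixed" : "mono") + " ns/call: "
                + averageNanos(input, 20_000));
    }
}
```

Running the class once with no argument and once with `mixed`, each in a fresh JVM, gives two numbers for the same `sumAreas` code; any difference stems from the inputs and the profile they induce, not from the implementation. The timing loop is intentionally crude. A real measurement would use JMH with forked runs and with input data modeled on the receiver types and branch outcomes observed in the target application.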

Problem

Research questions and friction points this paper is trying to address.

microbenchmarks

Java Virtual Machine

dynamic compilation

performance profiling

misleading optimization

Innovation

Methods, ideas, or system contributions that make the work stand out.

microbenchmarks

JVM

dynamic compilation