Compiling Away the Overhead of Race Detection

📅 2025-12-05

🤖 AI Summary

Dynamic data race detectors such as ThreadSanitizer incur high runtime overhead because they instrument memory accesses pervasively, which limits their adoption. This paper proposes a static, compiler-integrated approach, implemented in LLVM as a set of interprocedural analyses, that identifies and removes redundant instrumentation. The analyses reason about memory access patterns, synchronization, and thread creation to drop checks on provably race-free accesses. A dominance-based analysis additionally drops checks whose races are already covered by an equivalent earlier check, provided that reporting is limited to at least one representative per equivalence class. The completeness properties of the detector are preserved. On a suite of real-world applications, the approach achieves a geometric mean speedup of 1.34x, with peaks of 2.5x under high thread contention. The increase in compilation time is negligible, and the process is fully automatic. The optimizations have been accepted by the ThreadSanitizer maintainers and are in the process of being upstreamed.

📝 Abstract

Dynamic data race detectors are indispensable for flagging concurrency errors in software, but their high runtime overhead limits their adoption. This overhead stems primarily from pervasive instrumentation of memory accesses, a significant fraction of which is redundant. We address this inefficiency through a static, compiler-integrated approach that identifies and eliminates redundant instrumentation, drastically reducing the runtime cost of dynamic data race detectors. We introduce a suite of interprocedural static analyses that reason about memory access patterns, synchronization, and thread creation to eliminate instrumentation for provably race-free accesses, and we show that the completeness properties of the data race detector are preserved. We further observe that many inserted checks flag a race if and only if a preceding check has already flagged an equivalent race for the same memory location, albeit potentially at a different access. We characterize this notion of equivalence and show that, when limiting reporting to at least one representative for each equivalence class, a further class of redundant checks can be eliminated. We identify such accesses using a novel dominance-based elimination analysis. Based on these two insights, we have implemented five static analyses within LLVM, integrated with the instrumentation pass of the race detector ThreadSanitizer. Our experimental evaluation on a diverse suite of real-world applications demonstrates that our approach significantly reduces race detection overhead, achieving a geomean speedup of 1.34x, with peak speedups reaching 2.5x under high thread contention. This performance is achieved with a negligible increase in compilation time and, being fully automatic, places no additional burden on developers. Our optimizations have been accepted by the ThreadSanitizer maintainers and are in the process of being upstreamed.

Problem

Research questions and friction points this paper is trying to address.

Reduces runtime overhead of dynamic data race detectors

Eliminates redundant instrumentation via static compiler analyses

Preserves completeness while speeding up race detection significantly
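
To make the idea of "provably race-free accesses" concrete, the following C++ sketch marks the kinds of accesses that reasoning about thread creation, access patterns, and synchronization might cover. It is an illustration only: the function `sumInParallel` and the numbered comments are hypothetical and are not taken from the paper, and whether a given analysis can actually discharge each access depends on what it is able to prove.

```cpp
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical workload; all names here are illustrative assumptions.
long sumInParallel(int numThreads, int itemsPerThread) {
    std::vector<int> input(static_cast<std::size_t>(numThreads) * itemsPerThread);

    // (1) Writes before any worker thread exists. Thread creation orders
    //     them before every worker access, so they cannot race.
    for (std::size_t i = 0; i < input.size(); ++i) input[i] = 1;

    long total = 0;
    std::mutex m;
    std::vector<std::thread> workers;
    for (int t = 0; t < numThreads; ++t) {
        workers.emplace_back([&, t] {
            // (2) 'local' never escapes this thread, so its accesses
            //     cannot race with any other thread.
            long local = 0;
            for (int i = 0; i < itemsPerThread; ++i) {
                // (3) 'input' is only read once the workers are running.
                local += input[static_cast<std::size_t>(t) * itemsPerThread + i];
            }
            // (4) Shared and written by every worker: race-free only
            //     because each access holds the same mutex.
            std::lock_guard<std::mutex> guard(m);
            total += local;
        });
    }
    for (auto& w : workers) w.join();

    // (5) Read after all workers were joined; join orders it after (4).
    return total;
}
```

Calling `sumInParallel(4, 1000)` sums 4000 ones across four threads. Cases (1), (2), and (5) follow from thread creation, non-escaping storage, and join ordering; case (4) is the kind of access that typically needs the runtime check unless synchronization reasoning covers every access to `total`.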

Innovation

Methods, ideas, or system contributions that make the work stand out.

Compiler-integrated static analysis eliminates redundant instrumentation

Dominance-based elimination removes checks for equivalent race conditions

Preserves detector completeness while drastically reducing runtime overhead
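
The intuition behind eliminating checks for equivalent races can be sketched on a toy model. The code below is a simplified assumption, not the paper's LLVM analysis: it handles only a single straight-line block, where every earlier instruction trivially dominates every later one, and it treats an explicit `Sync` marker as the only synchronization. The types `Kind` and `Instr` and the function `keepChecks` are hypothetical names. Within a region without synchronization, a repeated write is covered by an earlier checked write to the same location, and a read is covered by an earlier checked read or write; a write is not covered by an earlier read, because a concurrent read races with the write but not with the read.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

enum class Kind { Read, Write, Sync };

struct Instr {
    Kind kind;
    std::string loc;  // accessed location; unused for Sync
};

// Toy model: returns, per instruction, whether its race check is kept.
std::vector<bool> keepChecks(const std::vector<Instr>& block) {
    std::unordered_set<std::string> readChecked;   // locations with a kept read check
    std::unordered_set<std::string> writeChecked;  // locations with a kept write check
    std::vector<bool> keep;
    keep.reserve(block.size());

    for (const Instr& in : block) {
        switch (in.kind) {
        case Kind::Sync:
            // Synchronization may change what is ordered with other
            // threads, so earlier checks no longer cover later accesses.
            readChecked.clear();
            writeChecked.clear();
            keep.push_back(false);  // nothing to check on a sync marker
            break;
        case Kind::Write:
            // Keep the check only if no earlier write check covers it.
            keep.push_back(writeChecked.insert(in.loc).second);
            break;
        case Kind::Read: {
            bool covered = writeChecked.count(in.loc) > 0 ||
                           readChecked.count(in.loc) > 0;
            if (!covered) readChecked.insert(in.loc);
            keep.push_back(!covered);
            break;
        }
        }
    }
    return keep;
}
```

For the sequence write `x`, read `x`, write `x`, sync, read `x`, read `x`, write `x`, this sketch keeps only the first write, the first read after the sync, and the final write. A real implementation would need dominance over the full control-flow graph and would have to treat calls that may synchronize conservatively.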
