🤖 AI Summary
To address the high timestamping overhead and poor scalability of sampling-based dynamic race detection, this paper proposes a lightweight timestamping mechanism tailored to sparse sampling, where the sampled events are far fewer than the total events in an execution. Our approach introduces (1) “freshness timestamps”, a new notion of timestamp; (2) a new data structure for storing timestamps; and (3) an algorithm that combines the two to reduce the cost of timestamping. We prove that the algorithm is close to optimal: the number of vector-clock traversals is bounded by the number of sampled events *s* and the number of threads *t*, and on any given execution the timestamping cost is close to the work that any timestamping-based algorithm must perform (instance optimality). Empirical evaluation on real-world benchmarks demonstrates the effectiveness of the algorithm over prior timestamping algorithms that are agnostic to sampling.
📝 Abstract
Dynamic race detection based on the happens-before (HB) partial order has become the de facto approach for quickly identifying data races in multi-threaded software. Most practical implementations use timestamps to infer causality between events and detect races based on these timestamps. Such an algorithm updates timestamps (stored in vector clocks) at every event in the execution and is known to induce excessive overhead. Random sampling has emerged as a promising algorithmic paradigm to offset this overhead, and it offers the promise of making sound race detection scalable. In this work, we consider the task of designing an efficient sampling-based race detector with low timestamping overhead when the number of sampled events is much smaller than the total number of events in an execution. To solve this problem, we propose (1) a new notion of freshness timestamp, (2) a new data structure to store timestamps, and (3) an algorithm that combines the two to reduce the cost of timestamping in sampling-based race detection. Further, we prove that our algorithm is close to optimal: the number of vector clock traversals is bounded by the number of sampled events and the number of threads. Moreover, it is instance optimal: on any given dynamic execution, the cost of timestamping incurred by our algorithm is close to the amount of work any timestamping-based algorithm must perform on that execution. Our evaluation on real-world benchmarks demonstrates the effectiveness of our algorithm over prior timestamping algorithms that are agnostic to sampling.
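To make the overhead described above concrete, the following is a minimal sketch of the sampling-agnostic baseline: a textbook vector-clock HB detector that checks races only on sampled accesses but still performs a full vector-clock operation at every acquire and release. It is not the algorithm proposed in the paper. The trace format, the function name `detect_races`, and the `sampled` index set are assumptions made purely for illustration.

```python
def join(a, b):
    """Pointwise maximum of two vector clocks."""
    return [max(x, y) for x, y in zip(a, b)]


def leq(a, b):
    """True if clock a is pointwise <= clock b."""
    return all(x <= y for x, y in zip(a, b))


def detect_races(trace, num_threads, sampled):
    """Baseline HB race detection with sampling (illustrative sketch).

    trace:   list of (thread_id, op, target), op in {"acq", "rel", "r", "w"}
    sampled: set of trace indices chosen by some sampler (hypothetical input)
    Returns (races, sync_clock_ops), where races is a list of index pairs and
    sync_clock_ops counts full vector-clock operations at lock events.
    """
    # Each thread starts with its own component set to 1.
    thread_clock = [[1 if i == t else 0 for i in range(num_threads)]
                    for t in range(num_threads)]
    lock_clock = {}      # lock -> clock stored at its last release
    history = {}         # variable -> list of (clock, is_write, index)
    races = []
    sync_clock_ops = 0

    for idx, (tid, op, target) in enumerate(trace):
        if op == "acq":
            # Paid at every acquire, whether or not anything was sampled.
            prev = lock_clock.get(target, [0] * num_threads)
            thread_clock[tid] = join(thread_clock[tid], prev)
            sync_clock_ops += 1
        elif op == "rel":
            # Paid at every release as well: copy the clock, then tick.
            lock_clock[target] = list(thread_clock[tid])
            thread_clock[tid][tid] += 1
            sync_clock_ops += 1
        elif idx in sampled:
            # Only sampled accesses are recorded and checked.
            is_write = (op == "w")
            now = thread_clock[tid]
            for clock, was_write, prev_idx in history.get(target, []):
                conflicting = is_write or was_write
                if conflicting and not leq(clock, now):
                    races.append((prev_idx, idx))
            history.setdefault(target, []).append((list(now), is_write, idx))

    return races, sync_clock_ops


trace = [
    (0, "w", "x"),    # 0
    (0, "acq", "m"),  # 1
    (0, "rel", "m"),  # 2
    (1, "acq", "m"),  # 3
    (1, "w", "x"),    # 4: ordered after event 0 through lock m
    (1, "rel", "m"),  # 5
    (2, "w", "x"),    # 6: unordered with events 0 and 4
]

all_races, all_ops = detect_races(trace, 3, sampled={0, 4, 6})
few_races, few_ops = detect_races(trace, 3, sampled={0, 6})
```

In this sketch, shrinking the sample reduces the number of reported pairs but leaves the count of vector-clock operations at lock events unchanged, which is the cost that a sampling-aware timestamping scheme aims to avoid.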