Aggregating Funnels for Faster Fetch&Add and Queues

πŸ“… 2024-11-21
πŸ›οΈ ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
To address the severe contention and scalability bottleneck caused by fetch-and-add operations on a single memory location under high concurrency, this paper proposes Aggregating Funnels, a mechanism that spreads atomic operations across multiple memory locations. Fetch-and-add operations are aggregated into batches: a single hardware fetch-and-add instruction applies the whole batch to one location, and each operation in the batch then computes its own result with a fetch-and-add on a different location, building an efficient aggregation path directly atop hardware-supported fetch-and-add instructions. Unlike prior combining techniques such as Combining Funnels, this design avoids their fundamental scalability limits. Experiments show significantly higher throughput than Combining Funnels and far better scalability than issuing hardware fetch-and-add instructions on a single location. When integrated into the fastest state-of-the-art concurrent queue, Aggregating Funnels eliminate a serialization bottleneck and substantially improve end-to-end throughput.

πŸ“ Abstract
Many concurrent algorithms require processes to perform fetch-and-add operations on a single memory location, which can be a hot spot of contention. We present a novel algorithm called Aggregating Funnels that reduces this contention by spreading the fetch-and-add operations across multiple memory locations. It aggregates fetch-and-add operations into batches so that the batch can be performed by a single hardware fetch-and-add instruction on one location and all operations in the batch can efficiently compute their results by performing a fetch-and-add instruction on a different location. We show experimentally that this approach achieves higher throughput than previous combining techniques, such as Combining Funnels, and is substantially more scalable than applying hardware fetch-and-add instructions on a single memory location. We show that replacing the fetch-and-add instructions in the fastest state-of-the-art concurrent queue by our Aggregating Funnels eliminates a bottleneck and greatly improves the queue's overall throughput.
Problem

Research questions and friction points this paper is trying to address.

Fetch-and-add operations on a single memory location create a hot spot of contention
Contention on that hot spot limits the throughput of many concurrent algorithms
Serialized fetch-and-add instructions cap the scalability of state-of-the-art concurrent queues
Innovation

Methods, ideas, or system contributions that make the work stand out.

Aggregates fetch-and-add operations into batches, each applied with a single hardware fetch-and-add instruction
Spreads contention across multiple memory locations: operations in a batch compute their results via a fetch-and-add on a different location
Replaces the fetch-and-add instructions in the fastest state-of-the-art concurrent queue, eliminating a bottleneck and greatly improving throughput
πŸ”Ž Similar Papers
No similar papers found.