🤖 AI Summary
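Task-based and actor-based programming models offer different trade-offs between developer productivity and runtime performance in distributed heterogeneous systems: task-based models compose better and are more productive, while actor-based models typically achieve better peak performance because of lower overheads.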
Method: We establish a formal duality between the two models and propose a unified modeling framework with low-overhead scheduling and communication optimizations. We apply these techniques to both Realm, an explicitly parallel task-based runtime, and Legion, an implicitly parallel task-based runtime, enabling fine-grained dependency-aware scheduling, zero-copy inter-node communication, and lightweight task migration.
Contribution/Results: Experiments show that the techniques reduce Realm's overheads by 1.7–5.3×, bringing them within a factor of two of the overheads of heavily optimized actor-based systems such as Charm++ and MPI, and improve strong scaling of unmodified Legion applications by 1.3–5.0×. The work shows that the duality of task and actor models extends beyond functionality to performance, so task-based systems can deliver performance competitive with actor-based systems without compromising productivity.
📝 Abstract
Programming models for distributed and heterogeneous machines are rapidly growing in popularity to meet the demands of modern workloads. Task and actor models are common choices that offer different trade-offs between development productivity and achieved performance. Task-based models offer better productivity and composition of software, whereas actor-based models routinely deliver better peak performance due to lower overheads. While task-based and actor-based models appear superficially different, we demonstrate that these programming models are duals of each other. Importantly, we show that this duality extends beyond functionality to performance, and we elucidate techniques that let task-based systems deliver performance competitive with actor-based systems without compromising productivity. We apply these techniques to both Realm, an explicitly parallel task-based runtime, and Legion, an implicitly parallel task-based runtime. We show that these techniques reduce Realm's overheads by 1.7-5.3x, coming within a factor of two of the overheads imposed by heavily optimized actor-based systems like Charm++ and MPI. We further show that our techniques improve the strong scaling of unmodified Legion applications by 1.3-5.0x.
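As a rough illustration of what it can mean for the two models to express the same computation, the toy sketch below writes one small job (a sum of squares) first as tasks connected by futures and then as actors exchanging messages. It uses only the Python standard library. It is not the paper's formal construction and does not use the Realm, Legion, Charm++, or MPI APIs; all names (`run_as_tasks`, `run_as_actors`, `Squarer`, `Accumulator`) are hypothetical and chosen for this example.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


def square(x):
    return x * x


def run_as_tasks(values):
    """Task view: work is a set of tasks, and futures carry the dependencies."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # One task per input value; each submit returns a future.
        futures = [pool.submit(square, v) for v in values]
        # The reduction depends on every square task through its future.
        return sum(f.result() for f in futures)


class Accumulator(threading.Thread):
    """Actor view: state lives inside the actor and changes only via messages."""

    def __init__(self, expected):
        super().__init__()
        self.mailbox = queue.Queue()
        self.expected = expected  # number of messages to consume before stopping
        self.total = 0

    def run(self):
        for _ in range(self.expected):
            self.total += self.mailbox.get()


class Squarer(threading.Thread):
    """Actor that squares each incoming message and forwards the result."""

    def __init__(self, downstream):
        super().__init__()
        self.mailbox = queue.Queue()
        self.downstream = downstream

    def run(self):
        while True:
            msg = self.mailbox.get()
            if msg is None:  # sentinel message: no more work
                break
            self.downstream.mailbox.put(msg * msg)


def run_as_actors(values):
    """Same computation, driven by sending messages instead of spawning tasks."""
    acc = Accumulator(expected=len(values))
    sq = Squarer(downstream=acc)
    acc.start()
    sq.start()
    for v in values:
        sq.mailbox.put(v)
    sq.mailbox.put(None)
    sq.join()
    acc.join()
    return acc.total
```

For the same input both functions return the same value; for example, `run_as_tasks([1, 2, 3, 4])` and `run_as_actors([1, 2, 3, 4])` both give 30. In this sketch, a task launch in the first version plays the role of a message send in the second, and a future dependency plays the role of a message receive. That correspondence is only an informal reading of the duality the abstract describes.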