🤖 AI Summary
This work addresses aggregation of time-series and image data over subsets of the domain, proposing a unified modeling framework grounded in category theory. Methodologically, it interprets many known aggregation operations (including summation, extremum computation, and sliding-window statistics) as (double) functors on appropriate (double) categories, abstracting aggregation semantics across diverse data structures. Because such functorial aggregations are amenable to parallel implementation via straightforward extensions of Blelloch's parallel scan algorithm, the framework extends the applicability of the scan paradigm and suggests new aggregation operators. Key contributions are: (1) a functorial view of aggregation under which parallelizability follows from the structure of the operation; (2) a unified treatment of existing subdomain aggregation operations together with newly proposed ones for time-series and image data; and (3) a composable mathematical foundation for aggregation across both modalities. The approach connects abstract categorical semantics with practical parallel computation.
📝 Abstract
Aggregation of time-series or image data over subsets of the domain is a fundamental task in data science. We show that many known aggregation operations can be interpreted as (double) functors on appropriate (double) categories. Such functorial aggregations are amenable to parallel implementation via straightforward extensions of Blelloch's parallel scan algorithm. In addition to providing a unified viewpoint on existing operations, this interpretation allows us to propose new aggregation operations for time-series and image data.
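To make the connection between functoriality and scans concrete, the following is a minimal one-dimensional sketch, not the paper's construction. It assumes the simplest setting: intervals of a time series compose by concatenation, and an aggregation sends that composition to an associative operation with an identity (a monoid). The names `blelloch_scan` and `aggregate` are hypothetical, and the scan is simulated sequentially; in a parallel implementation, each inner `for` loop would run its iterations concurrently.

```python
from functools import reduce
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def aggregate(xs: Sequence[T], a: int, b: int,
              op: Callable[[T, T], T], identity: T) -> T:
    """Aggregate xs over the half-open interval [a, b) (illustrative helper)."""
    return reduce(op, xs[a:b], identity)


def blelloch_scan(xs: Sequence[T], op: Callable[[T, T], T],
                  identity: T) -> List[T]:
    """Exclusive scan via up-sweep and down-sweep.

    Entry k of the result is the aggregate of xs over [0, k).
    Only associativity of `op` is required, not commutativity.
    """
    n = len(xs)
    if n == 0:
        return []
    size = 1
    while size < n:
        size *= 2
    # Pad to a power of two with the identity element.
    tree = list(xs) + [identity] * (size - n)

    # Up-sweep: build partial aggregates bottom-up.
    step = 1
    while step < size:
        for i in range(2 * step - 1, size, 2 * step):
            tree[i] = op(tree[i - step], tree[i])
        step *= 2

    # Down-sweep: push prefixes back down the tree.
    tree[size - 1] = identity
    step = size // 2
    while step >= 1:
        for i in range(2 * step - 1, size, 2 * step):
            left = tree[i - step]
            tree[i - step] = tree[i]
            # Prefix before the right child = prefix before the left child,
            # followed by the left child's aggregate (order matters).
            tree[i] = op(tree[i], left)
        step //= 2
    return tree[:n]


series = [3, 1, 4, 1, 5]
prefix_sums = blelloch_scan(series, lambda x, y: x + y, 0)
prefix_maxima = blelloch_scan(series, max, float("-inf"))
```

The property the scan relies on is that aggregating over `[a, c)` equals combining the aggregates over `[a, b)` and `[b, c)` for any split point `b`. Any operation with that property (sum, max, string concatenation) can be passed as `op` without changing the scan itself.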