The General Expiration Streaming Model: Diameter, $k$-Center, Counting, Sampling, and Friends

📅 2025-09-09
🤖 AI Summary
This paper addresses the setting where each element of a data stream arrives with its own expiration time, so items need not expire in the order they arrive. It proposes a *general expiration streaming model* that contains the classical sliding-window model and its timestamp-based variant as the special case of consistent expirations, where items expire in arrival order. For approximate counting, uniform sampling, and weighted sampling, the algorithms track active items efficiently without explicitly storing them all. For the metric-space problems of diameter estimation and *k*-center clustering, the paper develops new decomposition and coordination techniques together with a geometric dominance framework that filters out points made redundant by temporal and spatial proximity. These results significantly extend those known for sliding-window streams and include a strictly better diameter approximation factor in high-dimensional Euclidean space.

📝 Abstract
An important thread in the study of data-stream algorithms focuses on settings where stream items are active only for a limited time. We introduce a new expiration model, where each item arrives with its own expiration time. The special case where items expire in the order that they arrive, which we call consistent expirations, contains the classical sliding-window model of Datar, Gionis, Indyk, and Motwani [SICOMP 2002] and its timestamp-based variant of Braverman and Ostrovsky [FOCS 2007]. Our first set of results presents algorithms (in the expiration streaming model) for several fundamental problems, including approximate counting, uniform sampling, and weighted sampling by efficiently tracking active items without explicitly storing them all. Naturally, these algorithms have many immediate applications to other problems. Our second and main set of results designs algorithms (in the expiration streaming model) for the diameter and $k$-center problems, where items are points in a metric space. Our results significantly extend those known for the special case of sliding-window streams by Cohen-Addad, Schwiegelshohn, and Sohler [ICALP 2016], including also a strictly better approximation factor for the diameter in the important special case of high-dimensional Euclidean space. We develop new decomposition and coordination techniques along with a geometric dominance framework, to filter out redundant points based on both temporal and spatial proximity.
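To make the model concrete, the following is a minimal sketch of an exact baseline, not the paper's algorithm: it stores every active item, which is precisely the cost the paper's algorithms avoid. The class name `ExactExpirationStream`, the helper `is_consistent`, and the convention that an item is active at time `now` exactly when its expiration time is greater than `now` are assumptions made for illustration. Queries are assumed to arrive with non-decreasing `now`.

```python
import heapq


class ExactExpirationStream:
    """Exact linear-space baseline (hypothetical helper, not from the paper)."""

    def __init__(self):
        # Min-heap of (expiration_time, sequence_number, item).
        # The sequence number breaks ties so items are never compared.
        self._heap = []
        self._seq = 0

    def insert(self, item, expiration_time):
        heapq.heappush(self._heap, (expiration_time, self._seq, item))
        self._seq += 1

    def _expire(self, now):
        # Assumed convention: an item is active iff expiration_time > now.
        while self._heap and self._heap[0][0] <= now:
            heapq.heappop(self._heap)

    def active_count(self, now):
        self._expire(now)
        return len(self._heap)

    def active_items(self, now):
        self._expire(now)
        return [item for _, _, item in self._heap]


def is_consistent(expirations):
    """True if items, listed in arrival order, also expire in that order."""
    return all(a <= b for a, b in zip(expirations, expirations[1:]))
```

Under these assumptions, a sliding window of width `W` is the case where the item arriving at time `t` is given expiration time `t + W`, so `is_consistent` returns `True` for it, while a stream with expirations `[10, 3, 7]` is not consistent.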
Problem

Research questions and friction points this paper is trying to address.

Tracking active items with individual expiration times efficiently
Solving diameter and k-center problems in metric space streams
Developing algorithms for counting and sampling without full storage
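As a reference point for the two metric-space problems listed above, here is a hedged offline sketch that recomputes the answer from all active points at query time. It is not the paper's streaming algorithm: it uses brute force for the diameter and the classical farthest-first (greedy) traversal for $k$-center, which is known to give a 2-approximation offline. The function names, the `(point, expiration_time)` stream layout, the Euclidean metric via `math.dist`, and the activity convention `expiry > now` are all assumptions for illustration.

```python
import math


def active_points(stream, now):
    """Points whose expiration time is still in the future (assumed convention)."""
    return [p for p, expiry in stream if expiry > now]


def diameter(points, dist=math.dist):
    """Brute-force diameter: largest pairwise distance, 0.0 if fewer than 2 points."""
    return max(
        (dist(p, q) for i, p in enumerate(points) for q in points[i + 1:]),
        default=0.0,
    )


def greedy_k_center(points, k, dist=math.dist):
    """Farthest-first traversal; returns (centers, covering radius)."""
    if not points:
        return [], 0.0
    centers = [points[0]]
    # d[i] is the distance from points[i] to its nearest chosen center.
    d = [dist(p, centers[0]) for p in points]
    while len(centers) < min(k, len(points)):
        i = max(range(len(points)), key=d.__getitem__)
        if d[i] == 0:
            break  # every point already coincides with a center
        centers.append(points[i])
        d = [min(old, dist(p, points[i])) for old, p in zip(d, points)]
    return centers, max(d)
```

A streaming algorithm in the expiration model has to approximate these quantities at every time step without keeping the full list that `active_points` returns.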
Innovation

Methods, ideas, or system contributions that make the work stand out.

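Tracking active items for approximate counting and sampling without storing them all
Developing new decomposition and coordination techniques for diameter and k-center
Applying a geometric dominance framework to filter out points made redundant by temporal and spatial proximity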