🤖 AI Summary
This work introduces and systematically studies the “careless coupon collector” problem, in which each collected coupon type is independently lost with probability \( p \) at the end of every round. The model captures data collection systems, such as web crawlers, dynamic caches, and fault-resilient networks, in which information is lost through failures or forgetting. Using probabilistic analysis, Markov chain modeling, and asymptotic methods, the authors show that the collection time passes through multiple distinct phases as \( p \) grows: from the classical \( \Theta(n \log n) \) regime when \( p = o(\ln n / n^2) \) to an exponential \( \Theta\left((np/(1-p))^n\right) \) regime when \( p = \omega(1/n) \). Notably, when \( p = c/n \), the process stays in a metastable phase for \( e^{\Theta(n)} \) rounds, during which the fraction of collected coupon types is concentrated around \( 1/(1+c) \) with probability \( 1 - o(1) \). The authors also give an \( O(n^2) \)-time algorithm that computes the exact expected time to collect all \( n \) coupon types.
📝 Abstract
We initiate the study of the Careless Coupon Collector's Problem (CCCP), a novel variant of the classical coupon collector problem that we envision as a model for information systems such as web crawlers, dynamic caches, and fault-resilient networks. In CCCP, a collector attempts to gather $n$ distinct coupon types by obtaining one coupon type uniformly at random in each discrete round. However, the collector is \textit{careless}: at the end of each round, each collected coupon type is independently lost with probability $p$. We analyze the number of rounds required to complete the collection as a function of $n$ and $p$. In particular, we show that it transitions, in multiple distinct phases, from $\Theta(n \ln n)$ when $p = o\big(\frac{\ln n}{n^2}\big)$ up to $\Theta\big((\frac{np}{1-p})^n\big)$ when $p=\omega\big(\frac{1}{n}\big)$. Interestingly, when $p=\frac{c}{n}$, the process remains in a metastable phase, where the fraction of collected coupon types is concentrated around $\frac{1}{1+c}$ with probability $1-o(1)$, for a time window of length $e^{\Theta(n)}$. Finally, we give an algorithm that computes the expected completion time of CCCP in $O(n^2)$ time.
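The process described above can be explored with a small Monte Carlo sketch. This is an illustration only, not the authors' $O(n^2)$ algorithm: the function names `step`, `completion_time`, and `metastable_fraction` are hypothetical, and the code assumes that the collection counts as complete as soon as all $n$ types are held right after the draw, before that round's loss step. The abstract does not spell out this detail.

```python
import random


def step(collected: set, n: int, p: float, rng: random.Random) -> bool:
    """Run one CCCP round in place.

    Returns True if all n types are held right after the draw
    (assumed completion rule; the loss step is skipped in that case).
    """
    # Draw one coupon type uniformly at random.
    collected.add(rng.randrange(n))
    if len(collected) == n:
        return True
    # Careless step: each held type is lost independently with probability p.
    lost = [coupon for coupon in collected if rng.random() < p]
    collected.difference_update(lost)
    return False


def completion_time(n: int, p: float, rng: random.Random,
                    max_rounds: int = 10**6):
    """Rounds until completion, or None if max_rounds is exceeded."""
    collected = set()
    for t in range(1, max_rounds + 1):
        if step(collected, n, p, rng):
            return t
    return None


def metastable_fraction(n: int, c: float, rounds: int,
                        rng: random.Random) -> float:
    """Average fraction of held types over the second half of the run, p = c/n."""
    p = c / n
    collected = set()
    total, samples = 0.0, 0
    for t in range(rounds):
        step(collected, n, p, rng)
        if t >= rounds // 2:  # discard the first half as burn-in
            total += len(collected) / n
            samples += 1
    return total / samples


if __name__ == "__main__":
    rng = random.Random(0)
    print("p = 0, n = 50:", completion_time(50, 0.0, rng), "rounds")
    print("p = 1/n, n = 200:", metastable_fraction(200, 1.0, 4000, rng))
```

With $p = c/n$ and $c = 1$, the averaged fraction should sit near $\frac{1}{1+c} = \frac{1}{2}$, in line with the metastable phase in the abstract. A drift heuristic points the same way: with a fraction $x$ of types held, a round gains about $1 - x$ types and loses about $xnp = cx$, and the two balance at $x = \frac{1}{1+c}$.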