🤖 AI Summary
Existing aggregation methods for continuously arriving ranked data in dynamic environments suffer from high computational overhead and low accuracy. This paper proposes LR-Aggregation, a framework that combines an LR-tree index structure, an LR-distance metric (an equivalent reformulation of Spearman's footrule distance), and the Pick-A-Perm approximation algorithm. Incremental updates run in *O*(*n* log *n*) time, and the Pick-A-Perm component provides a worst-case approximation ratio of 2. Experiments on real-world and synthetic datasets show that LR-Aggregation consistently outperforms state-of-the-art methods: it is 1.5–3× faster and reduces average error by over 30%. To the authors' knowledge, it is the first dynamic rank aggregation solution that offers near-linear time complexity, a proven approximation ratio, and strong empirical performance.
📝 Abstract
The rank aggregation problem, which has many real-world applications, refers to the process of combining multiple input rankings into a single aggregated ranking. In dynamic settings, where new rankings arrive over time, efficiently updating the aggregated ranking is essential. This paper develops a fast dynamic rank aggregation algorithm that is efficient both in theory and in practice. First, we develop the LR-Aggregation algorithm, built on top of the LR-tree data structure, which is itself modeled on the LR-distance, a novel formulation equivalent to the classical Spearman's footrule distance. We then analyze the theoretical efficiency of the Pick-A-Perm algorithm, and show how it can be combined with the LR-Aggregation algorithm using another data structure that we develop. We demonstrate through experimental evaluations that LR-Aggregation produces close-to-optimal solutions in practice. We show that Pick-A-Perm has a theoretical worst-case approximation guarantee of 2. We also show that the LR-Aggregation and Pick-A-Perm algorithms, as well as the methodology for combining them, can each be run in $O(n \log n)$ time. To the best of our knowledge, this is the first fast, near-linear-time rank aggregation algorithm in the dynamic setting that has both a theoretical approximation guarantee and excellent practical performance (much better than the theoretical guarantee).
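To make the two classical ingredients named in the abstract concrete, the following is a minimal static sketch of Spearman's footrule distance and of Pick-A-Perm as it is commonly described (return one of the input rankings, chosen at random). It does not reproduce the paper's LR-tree, LR-distance, or dynamic update machinery, and the function names `footrule`, `total_cost`, `pick_a_perm`, and `best_input_perm` are hypothetical, chosen only for illustration.

```python
import random
from typing import List, Optional, Sequence


def footrule(a: Sequence[str], b: Sequence[str]) -> int:
    """Spearman's footrule: sum over items of |position in a - position in b|.

    Assumes a and b are rankings of the same set of items.
    """
    pos_b = {item: i for i, item in enumerate(b)}
    return sum(abs(i - pos_b[item]) for i, item in enumerate(a))


def total_cost(candidate: Sequence[str], rankings: Sequence[Sequence[str]]) -> int:
    """Aggregation objective: total footrule distance to all input rankings."""
    return sum(footrule(candidate, r) for r in rankings)


def pick_a_perm(rankings: Sequence[Sequence[str]],
                rng: Optional[random.Random] = None) -> List[str]:
    """Textbook Pick-A-Perm: return an input ranking chosen uniformly at random."""
    rng = rng or random.Random()
    return list(rng.choice(rankings))


def best_input_perm(rankings: Sequence[Sequence[str]]) -> List[str]:
    """Deterministic variant (an assumption, not from the paper):
    return the input ranking with the lowest total cost."""
    return list(min(rankings, key=lambda r: total_cost(r, rankings)))


rankings = [
    ["a", "b", "c", "d"],
    ["b", "a", "c", "d"],
    ["a", "c", "b", "d"],
]

print(footrule(rankings[0], rankings[1]))   # 2
print(total_cost(rankings[0], rankings))    # 4
print(best_input_perm(rankings))            # ['a', 'b', 'c', 'd']
```

Because the footrule distance satisfies the triangle inequality, returning an input ranking in this way cannot cost more than twice the optimum, which is consistent with the guarantee of 2 stated in the abstract. This naive version recomputes every distance from scratch; how the paper maintains the result as new rankings arrive is not shown here.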