Towards Tight Bounds for Estimating Degree Distribution in Streaming and Query Models

📅 2025-07-29
🤖 AI Summary
This paper studies sublinear approximation of the complementary cumulative degree histogram (ccdh) of a graph, a problem whose exact complexity in the streaming and query models was posed as an open problem at WOLA 2019. It adopts a bicriteria multiplicative approximation that allows controlled slack in both the domain and the range, and gives a model-independent algorithm that approximates the ccdh from a suitable vertex sample and edge sample. The paper then shows that both samples can be obtained efficiently in the streaming and query models, and establishes the first lower bounds for the problem in both models, which (almost) settles its complexity.

📝 Abstract
The degree distribution of a graph $G=(V,E)$, with $|V|=n$ and $|E|=m$, is one of the most fundamental objects of study in the analysis of graphs, as it embodies the relationships among entities. In particular, an important distribution derived from the degree distribution is the complementary cumulative degree histogram (ccdh). The ccdh is a fundamental summary of graph structure, capturing, for each threshold $d$, the number of vertices with degree at least $d$. For approximating the ccdh, we consider the $(\varepsilon_D,\varepsilon_R)$-BiCriteria Multiplicative Approximation, which allows for controlled multiplicative slack in both the domain and the range. The exact complexity of the problem was not known and had been posed as an open problem at WOLA 2019 [Sublinear.info, Problem 98]. In this work, we first design an algorithm that can approximate the ccdh if a suitable vertex sample and an edge sample can be obtained; thus, the algorithm is independent of any sublinear model. Next, we show that in the streaming and query models, these samples can be obtained efficiently. On the other hand, we establish the first lower bounds for this problem in both the query and streaming models, and (almost) settle the complexity of the problem across both sublinear models.
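To make the definition concrete, the following minimal sketch computes the exact ccdh of a small graph by reading every edge. It is only an illustration of the object being approximated, not the paper's sublinear algorithm. The function name `ccdh` and the input conventions (a simple undirected graph with vertices labeled `0..n-1` and edges given as pairs) are assumptions made for this example.

```python
from collections import Counter

def ccdh(n, edges):
    """Exact ccdh: result[d] = number of vertices with degree >= d,
    for d = 0..max_degree. Assumes a simple undirected graph on 0..n-1."""
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    max_deg = max(degree, default=0)
    hist = Counter(degree)           # hist[d] = number of vertices of degree exactly d
    c = [0] * (max_deg + 2)          # extra slot so c[max_deg + 1] = 0
    for d in range(max_deg, -1, -1): # suffix sums of the degree histogram
        c[d] = c[d + 1] + hist[d]
    return c[: max_deg + 1]

# Path 0-1-2-3 plus an isolated vertex 4: degrees are 1, 2, 2, 1, 0.
print(ccdh(5, [(0, 1), (1, 2), (2, 3)]))  # [5, 4, 2]
```

The result is non-increasing in $d$ by construction, since every vertex counted at threshold $d+1$ is also counted at threshold $d$.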
Problem

Research questions and friction points this paper is trying to address.

Estimating degree distribution in streaming and query models
Approximating complementary cumulative degree histogram (ccdh)
Establishing lower bounds for ccdh approximation complexity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Vertex and edge sampling for ccdh approximation
Efficient sampling in streaming and query models
Establishing lower bounds for query and streaming
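As a rough illustration of the sampling idea, here is a naive baseline that estimates the ccdh from a uniform vertex sample alone. It is not the algorithm from the paper, which also relies on an edge sample and comes with bicriteria guarantees; the names `estimate_ccdh` and `degree_of` (a hypothetical degree oracle) and the choice of sampling with replacement are all assumptions for this sketch.

```python
import random

def estimate_ccdh(degree_of, n, thresholds, num_samples, rng):
    """Naive baseline: sample vertices uniformly with replacement and scale
    the fraction with degree >= d up to n. `degree_of` is a hypothetical
    oracle returning the degree of a vertex in 0..n-1."""
    sampled_degrees = [degree_of(rng.randrange(n)) for _ in range(num_samples)]
    return {
        d: n * sum(1 for x in sampled_degrees if x >= d) / num_samples
        for d in thresholds
    }

# Star graph on 1000 vertices: one center of degree 999, 999 leaves of degree 1.
n = 1000
degrees = [n - 1] + [1] * (n - 1)
est = estimate_ccdh(lambda v: degrees[v], n, [0, 1, 2], 200, random.Random(0))
print(est[0], est[1])  # both are exactly n, since every vertex has degree >= 1
```

On the star graph the true value at threshold 2 is 1, but a uniform vertex sample rarely hits the center, so the estimate there is typically 0. Sampling edges reaches high-degree vertices with probability proportional to their degree, which is one plausible reason to combine the two kinds of samples.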
Arijit Bishnu
Indian Statistical Institute, Kolkata, India
Debarshi Chanda
Indian Statistical Institute, Kolkata, India
Gopinath Mishra
Post Doctoral Fellow
Model centric computation