Scalable Approximate Biclique Counting over Large Bipartite Graphs

📅 2025-05-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Exact $(p,q)$-biclique counting in large bipartite graphs is computationally challenging because the number of candidate subgraphs grows combinatorially. To address this, we propose a scalable approximation algorithm built on the $(p,q)$-broom, a special spanning tree of the $(p,q)$-biclique whose occurrences can be counted with graph coloring and dynamic programming. The intermediate results of the dynamic programming drive a sampling algorithm that yields unbiased estimates of the biclique count with provable error guarantees. In experiments on nine real-world bipartite networks, our method reduces estimation error by up to 8× and runs up to 50× faster than existing approximation techniques.

📝 Abstract

Counting $(p,q)$-bicliques in bipartite graphs is crucial for a variety of applications, from recommendation systems to cohesive subgraph analysis. Yet it remains computationally challenging, because exact counting of $(p,q)$-bicliques suffers from combinatorial explosion. In many scenarios, however, such as graph kernel methods, exact counts are not strictly required. To design a scalable and high-quality approximate solution, we turn to the $(p,q)$-broom, a special spanning tree of the $(p,q)$-biclique, which can be counted via graph coloring and efficient dynamic programming. Based on the intermediate results of the dynamic programming, we propose an efficient sampling algorithm that derives the approximate $(p,q)$-biclique count from the $(p,q)$-broom counts. Theoretically, our method offers unbiased estimates with provable error guarantees. Empirically, our solution outperforms existing approximation techniques in both accuracy (up to 8$\times$ error reduction) and runtime (up to 50$\times$ speedup) on nine real-world bipartite networks, providing a scalable solution for large-scale $(p,q)$-biclique counting.

Problem

Research questions and friction points this paper is trying to address.

Scalable counting of (p,q)-bicliques in large bipartite graphs

Overcoming computational challenges via approximate solutions

Providing unbiased estimates with error guarantees efficiently
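
To make the counting problem concrete, the sketch below is a brute-force exact counter. It is not the paper's algorithm; it is the textbook baseline and illustrates why exact counting scales poorly. It assumes the graph is given as a dictionary `adj` mapping each left-side vertex to the set of its right-side neighbors, and that `p >= 1`. The function name `count_bicliques_exact` and this input layout are illustrative choices, not taken from the paper.

```python
from itertools import combinations
from math import comb


def count_bicliques_exact(adj, p, q):
    """Count (p,q)-bicliques by enumerating every p-subset of left vertices.

    adj: dict mapping each left vertex to a set of right neighbors
         (hypothetical input layout chosen for this sketch).
    """
    total = 0
    # The loop visits C(|L|, p) subsets, which is the source of the blow-up.
    for left_subset in combinations(adj, p):
        # Right vertices adjacent to every vertex in the chosen subset.
        common = set.intersection(*(adj[u] for u in left_subset))
        # Any q of those common neighbors completes one (p,q)-biclique.
        total += comb(len(common), q)
    return total


# Toy graph: u1 and u2 share two neighbors, u3 touches only v2.
toy = {"u1": {"v1", "v2"}, "u2": {"v1", "v2"}, "u3": {"v2"}}
print(count_bicliques_exact(toy, 2, 2))
```

On the toy graph only the pair (u1, u2) shares two right neighbors, so exactly one (2,2)-biclique exists. The cost is dominated by the number of p-subsets, which becomes impractical for large left sides.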

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses (p,q)-broom spanning tree for counting

Employs graph coloring and dynamic programming

Samples efficiently for approximate biclique counts
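
The idea of an unbiased sampling estimator can be illustrated with a much simpler scheme than the one summarized above. The sketch below does not use brooms, coloring, or dynamic programming. It substitutes naive uniform sampling of p-subsets of left vertices and rescales the sample mean by the number of such subsets. The function name `estimate_bicliques`, the `adj` dictionary layout, and the parameters `num_samples` and `seed` are all assumptions made for this example.

```python
import random
from math import comb


def estimate_bicliques(adj, p, q, num_samples, seed=None):
    """Naive unbiased estimate of the (p,q)-biclique count.

    adj: dict mapping each left vertex to a set of right neighbors
         (hypothetical input layout chosen for this sketch).
    This is plain uniform subset sampling, not the broom-based method.
    """
    rng = random.Random(seed)
    left = list(adj)
    if p < 1 or len(left) < p or num_samples < 1:
        return 0.0
    # Number of p-subsets of the left side; the rescaling factor.
    scale = comb(len(left), p)
    acc = 0
    for _ in range(num_samples):
        # Draw one p-subset uniformly at random.
        subset = rng.sample(left, p)
        common = set.intersection(*(adj[u] for u in subset))
        # Bicliques whose left side is exactly this subset.
        acc += comb(len(common), q)
    # The sample mean times the subset count has the exact count as its expectation.
    return scale * acc / num_samples


# Complete bipartite graph K_{3,3}: every sample contributes the same value.
k33 = {u: {"a", "b", "c"} for u in ("x", "y", "z")}
print(estimate_bicliques(k33, 2, 2, num_samples=100, seed=0))
```

Each sampled subset contributes the number of bicliques it anchors, so the expectation of the rescaled mean equals the exact count. On sparse real graphs most uniform samples contribute zero, which gives this naive estimator high variance; that weakness is the usual motivation for structure-guided sampling of the kind the paper describes.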

🔎 Similar Papers

Efficient Historical Butterfly Counting in Large Temporal Bipartite Networks via Graph Structure-aware Index