Finding coherent node groups in directed graphs

📅 2023-10-04
🏛️ PKDD/ECML Workshops
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This paper studies ordered, feature-aware node grouping in directed graphs: partitioning a feature-labeled graph into a sequence of coherent groups while penalizing forward and backward inter-group edges with different weights. Setting both weights to zero recovers unordered clustering, while a much heavier backward penalty makes the group order matter and generalizes classic segmentation. The authors use an iterative scheme that alternates between assigning nodes to groups given the centroids and recomputing the centroids given the groups, and they show that, unlike in clustering, the assignment subproblem is NP-hard. The subproblem can be solved exactly with dynamic programming when the graph is a tree and with a minimum cut when there are two groups. For the general case, they propose a heuristic that optimizes each pair of groups separately and a greedy search that moves nodes between groups. Experiments indicate that the algorithms are practical and yield interpretable results.
📝 Abstract
Summarizing a large graph by grouping its nodes into clusters is a standard technique for studying a network. Traditionally, the order of the discovered groups does not matter. However, there are applications where, for example, given a directed graph, we would like to find coherent groups while minimizing the backward cross edges. More formally, in this paper, we study a problem where we are given a directed network and are asked to partition the graph into a sequence of coherent groups while attempting to conform to the cross edges. We assume that nodes in the network have features, and we measure the group coherence by comparing these features. Furthermore, we incorporate the cross edges by penalizing the forward cross edges and backward cross edges with different weights. If the weights are set to 0, then the problem is equivalent to clustering. However, if we penalize the backward edges significantly more, then the order of discovered groups matters, and we can view our problem as a generalization of a classic segmentation problem. To solve the problem, we consider a common iterative approach where we solve the groups given the centroids, and then find the centroids given the groups. We show that, unlike in clustering, the first subproblem is NP-hard. However, we show that if the underlying graph is a tree, we can solve the subproblem with dynamic programming. In addition, if the number of groups is 2, we can solve the subproblem with a minimum cut. For the more general case, we propose a heuristic where we optimize each pair of groups separately while keeping the remaining groups intact. We also propose a greedy search where nodes are moved between the groups while optimizing the overall loss. We demonstrate with our experiments that the algorithms are practical and yield interpretable results.
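The abstract states the objective only in words. One plausible way to write it down is sketched here, under assumptions that the abstract does not confirm: coherence is measured as the squared Euclidean distance between a node's feature vector $x_v$ and its group centroid $c_i$, the ordered groups are $V_1, \dots, V_k$, $g(v)$ is the index of the group containing $v$, and $\lambda_f$, $\lambda_b$ are the forward and backward cross-edge weights. All symbols are illustrative.

```latex
\min_{V_1,\dots,V_k,\; c_1,\dots,c_k}\;
\sum_{i=1}^{k} \sum_{v \in V_i} \lVert x_v - c_i \rVert^2
\;+\; \lambda_f \,\bigl|\{(u,v) \in E : g(u) < g(v)\}\bigr|
\;+\; \lambda_b \,\bigl|\{(u,v) \in E : g(u) > g(v)\}\bigr|
```

Under this assumed form, setting $\lambda_f = \lambda_b = 0$ leaves only the first term, which is the usual centroid-based clustering objective, while $\lambda_b \gg \lambda_f$ makes the order of the groups matter.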
Problem

Research questions and friction points this paper is trying to address.

Partition directed graphs into coherent node groups using features
Generalize clustering by penalizing cross edges with different weights
Solve the NP-hard assignment subproblem exactly in special cases and with heuristics in the general case
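The greedy search mentioned in the abstract (moving nodes between groups while the overall loss improves) could look roughly like the sketch below. This is not the authors' implementation: the loss assumes squared Euclidean distance to the group mean plus weighted counts of forward and backward cross edges, and the names `loss`, `greedy_search`, `lam_fwd`, and `lam_bwd` are hypothetical.

```python
def loss(features, edges, groups, k, lam_fwd, lam_bwd):
    """Assumed loss: squared distance to group means plus cross-edge penalties."""
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for x, g in zip(features, groups):
        counts[g] += 1
        for j, xj in enumerate(x):
            sums[g][j] += xj
    # Centroid of each group is the mean of its members (zeros if the group is empty).
    cents = [[s / c for s in row] if c else row for row, c in zip(sums, counts)]
    coherence = sum(
        sum((xj - cj) ** 2 for xj, cj in zip(x, cents[g]))
        for x, g in zip(features, groups)
    )
    # An edge (u, v) is forward if u's group precedes v's group, backward if it follows.
    fwd = sum(1 for u, v in edges if groups[u] < groups[v])
    bwd = sum(1 for u, v in edges if groups[u] > groups[v])
    return coherence + lam_fwd * fwd + lam_bwd * bwd


def greedy_search(features, edges, groups, k, lam_fwd, lam_bwd):
    """Move single nodes to their best group until no move lowers the loss."""
    groups = list(groups)
    best = loss(features, edges, groups, k, lam_fwd, lam_bwd)
    improved = True
    while improved:
        improved = False
        for v in range(len(groups)):
            keep = groups[v]  # best group found so far for node v
            for g in range(k):
                if g == keep:
                    continue
                groups[v] = g
                cand = loss(features, edges, groups, k, lam_fwd, lam_bwd)
                if cand < best - 1e-12:  # accept only strict improvements
                    best, keep, improved = cand, g, True
            groups[v] = keep
    return groups, best
```

Because every accepted move strictly lowers the loss, the loop terminates, but it can stop at a local optimum. The sketch recomputes the full loss for each candidate move, which keeps it short at the expense of speed.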
Innovation

Methods, ideas, or system contributions that make the work stand out.

Penalizes forward and backward cross edges differently
Solves the assignment subproblem with dynamic programming on trees and with a minimum cut for two groups
Proposes two heuristics for the general case: pairwise group optimization and greedy search
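The abstract notes that with two groups the assignment subproblem can be solved with a minimum cut but does not give the construction. The sketch below uses the standard s-t cut reduction for two-label problems, which is an assumption rather than the paper's confirmed method: nodes on the source side form the first group, each directed edge contributes one arc weighted by the forward penalty and a reverse arc weighted by the backward penalty, and squared Euclidean distance to the fixed centroids `c1` and `c2` is the assumed coherence cost. The function names are hypothetical, and the graph is assumed to have no self-loops.

```python
from collections import deque


def min_cut_source_side(cap, s, t):
    """Edmonds-Karp max flow; returns (cut value, nodes on the source side)."""
    res = {u: dict(nbrs) for u, nbrs in cap.items()}
    for u, nbrs in cap.items():
        for v in nbrs:
            res.setdefault(v, {}).setdefault(u, 0.0)  # reverse residual arcs
    flow = 0.0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            # No augmenting path left: the reachable set defines a minimum cut.
            return flow, set(parent)
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(res[u][v] for u, v in path)
        for u, v in path:
            res[u][v] -= push
            res[v][u] += push
        flow += push


def assign_two_groups(features, edges, c1, c2, lam_fwd, lam_bwd):
    """Optimal two-group assignment for fixed centroids via a minimum s-t cut."""
    def dist(x, c):
        return sum((a - b) ** 2 for a, b in zip(x, c))

    s, t = "s", "t"
    cap = {s: {}}
    for v, x in enumerate(features):
        cap[s][v] = dist(x, c2)    # cut (paid) when v lands in the second group
        cap[v] = {t: dist(x, c1)}  # cut (paid) when v lands in the first group
    for u, v in edges:
        cap[u][v] = cap[u].get(v, 0.0) + lam_fwd  # u first, v second: forward edge
        cap[v][u] = cap[v].get(u, 0.0) + lam_bwd  # v first, u second: backward edge
    cost, source_side = min_cut_source_side(cap, s, t)
    groups = [0 if v in source_side else 1 for v in range(len(features))]
    return groups, cost
```

Nodes are indexed `0..n-1`, and the returned list holds `0` for the first group and `1` for the second. All capacities are non-negative as long as both penalties are non-negative, which the max-flow routine requires.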