Distribution Matching for Graph Quantification Under Structural Covariate Shift

📅 2025-12-30
🏛️ ECML/PKDD
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses structural covariate shift in graph data, which biases label distribution estimation and significantly degrades conventional quantification learning methods when the training and test subgraphs differ structurally. To mitigate this, the authors integrate a structural importance sampling mechanism into the kernel density estimation–based KDEy quantification framework. By aligning the label-conditional distributions between training and test subgraphs, the method alleviates the adverse effects of structural shift. Notably, this is the first approach to incorporate structural importance sampling into KDEy, thereby circumventing the limitation of traditional methods that rely on the prior probability shift (PPS) assumption. Experimental results show that the proposed method substantially outperforms existing quantification approaches on graph data with structural shifts, yielding more accurate and robust label distribution estimates.

📝 Abstract
Graphs are commonly used in machine learning to model relationships between instances. Consider the task of predicting the political preferences of users in a social network; to solve this task one should consider both the features of each individual user and the relationships between them. However, oftentimes one is not interested in the label of a single instance but rather in the distribution of labels over a set of instances; e.g., when predicting the political preferences of users, the overall prevalence of a given opinion might be of higher interest than the opinion of a specific person. This label prevalence estimation task is commonly referred to as quantification learning (QL). Current QL methods for tabular data are typically based on the so-called prior probability shift (PPS) assumption, which states that the label-conditional instance distributions should remain equal across the training and test data. In the graph setting, PPS generally does not hold if the shift between training and test data is structural, i.e., if the training data comes from a different region of the graph than the test data. To address such structural shifts, an importance sampling variant of the popular adjusted count quantification approach has previously been proposed. In this work, we extend the idea of structural importance sampling to the state-of-the-art KDEy quantification approach. We show that our proposed method adapts to structural shifts and outperforms standard quantification approaches.
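To make the KDE-based quantification idea from the abstract concrete, here is a minimal illustrative sketch for a binary, one-dimensional case: fit per-class kernel density estimates on training classifier scores (with per-instance importance weights, which in the paper would be derived from graph structure), then pick the class prevalence whose mixture density best explains the test scores. This is a hedged sketch, not the authors' implementation; the synthetic data, uniform weights, and maximum-likelihood fitting objective are illustrative assumptions.

```python
import numpy as np
from scipy.stats import gaussian_kde
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)

# Synthetic 1-D classifier scores for two classes in the training set.
scores_neg = rng.normal(-1.0, 1.0, 500)   # class 0
scores_pos = rng.normal(+1.0, 1.0, 500)   # class 1

# Hypothetical structural importance weights per training instance.
# In the paper these would come from the graph; here they are uniform.
w_neg = np.ones_like(scores_neg)
w_pos = np.ones_like(scores_pos)

# Importance-weighted per-class kernel density estimates.
kde_neg = gaussian_kde(scores_neg, weights=w_neg)
kde_pos = gaussian_kde(scores_pos, weights=w_pos)

# Test sample drawn with a true positive prevalence of 0.7.
test = np.concatenate([rng.normal(-1.0, 1.0, 150),
                       rng.normal(+1.0, 1.0, 350)])

def neg_log_lik(alpha):
    # Negative log-likelihood of the test scores under the mixture
    # alpha * f_pos + (1 - alpha) * f_neg.
    mix = alpha * kde_pos(test) + (1 - alpha) * kde_neg(test)
    return -np.log(np.clip(mix, 1e-12, None)).sum()

res = minimize_scalar(neg_log_lik, bounds=(0.0, 1.0), method="bounded")
print(f"estimated prevalence: {res.x:.3f}")  # should be roughly near 0.7
```

The prevalence is recovered without assuming PPS holds exactly: the importance weights let the training-side densities be reweighted to match the test region's structure before the mixture is fitted.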
Problem

Research questions and friction points this paper is trying to address.

quantification learning
structural covariate shift
graph data
label distribution estimation
prior probability shift
Innovation

Methods, ideas, or system contributions that make the work stand out.

quantification learning
structural covariate shift
importance sampling
graph data
KDEy