FSD-CAP: Fractional Subgraph Diffusion with Class-Aware Propagation for Graph Feature Imputation

📅 2026-01-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses node feature imputation in graph-structured data under extreme missing rates (e.g., 99.5%). The authors propose a two-stage framework. In the first stage, a graph-distance-guided subgraph expansion localizes diffusion, and a fractional diffusion operator adjusts propagation sharpness to the local structure, which is intended to limit error propagation. In the second stage, imputed features are refined with class-aware propagation that combines pseudo-labels and neighborhood entropy to promote consistency. With 99.5% of features missing across five benchmark datasets, the method reaches average node classification accuracies of 80.06% (structural missing) and 81.01% (uniform missing), close to the 81.31% of a standard GCN with full features. For link prediction it reaches AUC scores of 91.65% and 92.41%, compared to 95.06% in the fully observed case. The authors also report stronger results than other models on large-scale and heterophilic datasets.
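To make the first stage more concrete, here is a minimal sketch of masked feature diffusion with a tunable sharpness exponent. It is not the authors' operator: the function names `fractional_transition` and `impute`, the parameter `gamma`, and the choice to raise symmetric-normalized edge weights to the power `gamma` before row normalization are all assumptions made for illustration. The subgraph expansion step is omitted, and each node carries a single scalar feature to keep the example short.

```python
import math


def fractional_transition(adj, gamma):
    """Build a row-stochastic propagation matrix from a symmetric adjacency matrix.

    Assumption (not from the paper): each edge weight a_ij / sqrt(d_i * d_j)
    is raised to the power `gamma`, then every row is renormalized to sum to 1.
    gamma = 0 gives uniform averaging over neighbors; larger gamma shifts
    weight toward low-degree neighbors.
    """
    n = len(adj)
    deg = [sum(row) for row in adj]
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        weights = {}
        for j in range(n):
            if adj[i][j] > 0:
                w = adj[i][j] / math.sqrt(deg[i] * deg[j])
                weights[j] = w ** gamma
        total = sum(weights.values())
        # Isolated nodes have no neighbors, so their row stays all zeros.
        for j, w in weights.items():
            P[i][j] = w / total
    return P


def impute(adj, features, known, gamma=1.0, iters=50):
    """Diffuse scalar features over the graph, clamping observed values.

    `features[i]` is only read where `known[i]` is True; missing entries
    start at 0.0 and are overwritten by the weighted neighbor average.
    """
    P = fractional_transition(adj, gamma)
    n = len(adj)
    x = [features[i] if known[i] else 0.0 for i in range(n)]
    for _ in range(iters):
        diffused = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
        # Reset observed entries so they act as fixed boundary conditions.
        x = [features[i] if known[i] else diffused[i] for i in range(n)]
    return x
```

On a three-node path graph whose two end nodes are observed with values 1.0 and 3.0, `impute` fills the middle node with their average, since both neighbors have the same degree and therefore the same weight for any `gamma`.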

📝 Abstract

Imputing missing node features in graphs is challenging, particularly under high missing rates. Existing methods based on latent representations or global diffusion often fail to produce reliable estimates, and may propagate errors across the graph. We propose FSD-CAP, a two-stage framework designed to improve imputation quality under extreme sparsity. In the first stage, a graph-distance-guided subgraph expansion localizes the diffusion process. A fractional diffusion operator adjusts propagation sharpness based on local structure. In the second stage, imputed features are refined using class-aware propagation, which incorporates pseudo-labels and neighborhood entropy to promote consistency. We evaluate FSD-CAP on multiple datasets. With $99.5\%$ of features missing across five benchmark datasets, FSD-CAP achieves average accuracies of $80.06\%$ (structural) and $81.01\%$ (uniform) in node classification, close to the $81.31\%$ achieved by a standard GCN with full features. For link prediction under the same setting, it reaches AUC scores of $91.65\%$ (structural) and $92.41\%$ (uniform), compared to $95.06\%$ for the fully observed case. Furthermore, FSD-CAP outperforms other models on both large-scale and heterophilic datasets.
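The second stage combines pseudo-labels with neighborhood entropy. The sketch below shows one way such a refinement could look. It is an illustration under stated assumptions, not the paper's formulation: the names `neighborhood_entropy` and `refine`, the trust weight `1 - H / log(C)`, and the blend toward the mean of same-class neighbors are all hypothetical choices. Features are scalars and pseudo-labels are assumed to be given.

```python
import math
from collections import Counter


def neighborhood_entropy(neighbors, pseudo_labels, node):
    """Shannon entropy (natural log) of the pseudo-label distribution
    among the neighbors of `node`. Returns 0.0 for a node without neighbors."""
    counts = Counter(pseudo_labels[j] for j in neighbors[node])
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def refine(neighbors, pseudo_labels, x, known, num_classes):
    """Pull each imputed feature toward the mean of same-class neighbors.

    Assumption (not from the paper): the pull strength is
    trust = 1 - H / log(num_classes), so a neighborhood with one dominant
    pseudo-label is trusted fully and a maximally mixed one is ignored.
    Requires num_classes >= 2. Observed features are left unchanged.
    """
    out = list(x)
    for i in range(len(x)):
        if known[i]:
            continue
        same_class = [x[j] for j in neighbors[i]
                      if pseudo_labels[j] == pseudo_labels[i]]
        if not same_class:
            continue
        h = neighborhood_entropy(neighbors, pseudo_labels, i)
        trust = 1.0 - h / math.log(num_classes)
        class_mean = sum(same_class) / len(same_class)
        out[i] = (1.0 - trust) * x[i] + trust * class_mean
    return out
```

With two classes, a node whose neighbors all share its pseudo-label has entropy 0 and is replaced by their mean, while a node whose neighbors split evenly between the classes has entropy log 2 and keeps its stage-one estimate.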

Problem

Research questions and friction points this paper is trying to address.

graph feature imputation

missing node features

high missing rates

graph diffusion

feature sparsity

Innovation

Methods, ideas, or system contributions that make the work stand out.

fractional diffusion

class-aware propagation

graph feature imputation