GAIS: A Novel Approach to Instance Selection with Graph Attention Networks

📅 2024-12-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
Instance selection (IS) for graph-structured data remains underexplored, particularly when it comes to capturing complex dependencies between samples. Method: This paper proposes what it describes as the first graph attention network (GAT)-based, structure-aware IS method. Samples are modeled as graph nodes; block-wise graph construction, random masking, and dynamic k-nearest-neighbor similarity thresholding jointly encode the relationships between samples; and a GAT learns node-level confidence scores that identify the most informative instances. Contribution/Results: The work integrates GATs into IS and introduces a scalable, graph-structured evaluation framework. Evaluated on 13 benchmark datasets, the method achieves an average reduction rate of 96% while maintaining or improving classification accuracy, outperforming conventional IS approaches. The results suggest that explicit graph-structural modeling improves both the effectiveness and the generalizability of instance importance assessment.
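
The graph-construction step described in this summary can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `build_chunk_edges`, its parameters, the use of cosine similarity, and the fixed threshold (standing in for the dynamic k-nearest-neighbor thresholding) are all assumptions made for illustration. Random masking is interpreted here as dropping candidate edges at random, which may differ from what the paper does.

```python
import numpy as np


def build_chunk_edges(X, chunk_size=256, threshold=0.8, mask_prob=0.1, seed=0):
    """Hypothetical sketch: build an undirected edge list block by block.

    Assumptions (not from the paper): cosine similarity, a fixed
    threshold, and masking applied to candidate edges.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    # Normalise rows so that a dot product equals cosine similarity.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.clip(norms, 1e-12, None)

    edges = []
    for start in range(0, n, chunk_size):
        stop = min(start + chunk_size, n)

        # Similarities are computed only inside the current block,
        # so memory grows with chunk_size**2 rather than n**2.
        sim = Xn[start:stop] @ Xn[start:stop].T
        np.fill_diagonal(sim, 0.0)  # no self-loops

        # Similarity thresholding: keep sufficiently similar pairs.
        keep = sim >= threshold

        # Random masking: drop each candidate edge with probability mask_prob.
        keep &= rng.random(keep.shape) >= mask_prob

        # Use the upper triangle so each undirected edge appears once.
        rows, cols = np.nonzero(np.triu(keep, k=1))
        edges.append(np.stack([rows + start, cols + start], axis=1))

    if not edges:
        return np.empty((0, 2), dtype=int)
    return np.concatenate(edges, axis=0)
```

Each row of the returned array is a pair of node indices into `X`. In this sketch, samples in different blocks are never connected, which is the price paid for keeping the similarity matrix small.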

📝 Abstract
Instance selection (IS) is a crucial technique in machine learning that aims to reduce dataset size while maintaining model performance. This paper introduces a novel method called Graph Attention-based Instance Selection (GAIS), which leverages Graph Attention Networks (GATs) to identify the most informative instances in a dataset. GAIS represents the data as a graph and uses GATs to learn node representations, enabling it to capture complex relationships between instances. The method processes data in chunks, applies random masking and similarity thresholding during graph construction, and selects instances based on confidence scores from the trained GAT model. Experiments on 13 diverse datasets demonstrate that GAIS consistently outperforms traditional IS methods in terms of effectiveness, achieving high reduction rates (average 96%) while maintaining or improving model performance. Although GAIS exhibits slightly higher computational costs, its superior performance in maintaining accuracy with significantly reduced training data makes it a promising approach for graph-based data selection.
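
The final selection step in the abstract, choosing instances by their confidence scores, might look roughly like the sketch below. The abstract does not say whether selection uses a top-k cut or a score threshold, so the top-fraction rule here is an assumption, as are the function name `select_by_confidence` and the `reduction_rate` parameter. The 96% figure is the paper's reported average outcome; treating it as a target is purely illustrative. The `confidence` array stands in for the per-node output of a trained GAT, which is not modeled here.

```python
import numpy as np


def select_by_confidence(confidence, reduction_rate=0.96):
    """Hypothetical sketch: keep the highest-confidence instances.

    `confidence` is a 1-D array of per-instance scores (a stand-in for
    trained GAT output). `reduction_rate` is the fraction of instances
    to discard; using it as a target is an assumption.
    """
    n = confidence.shape[0]

    # Number of instances to keep, with at least one retained.
    n_keep = max(1, int(round(n * (1.0 - reduction_rate))))

    # Sort indices by descending confidence; stable sort keeps ties in
    # their original order.
    order = np.argsort(-confidence, kind="stable")

    # Return the kept indices in ascending order for easy slicing.
    return np.sort(order[:n_keep])
```

With 100 instances and a reduction rate of 0.96, four instances remain, so a downstream classifier would be trained on `X[kept]` and `y[kept]` only.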
Problem

Research questions and friction points this paper is trying to address.

Data Reduction
Machine Learning
Graph Data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph Attention Networks
Instance Selection
Data Reduction