GAIS: A Novel Approach to Instance Selection with Graph Attention Networks

📅 2024-12-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Instance selection (IS) for graph-structured data remains underexplored, particularly with respect to capturing complex inter-sample dependencies. Method: The paper proposes a graph attention network (GAT)-based, structure-aware IS method, which it presents as the first of its kind. Samples are modeled as graph nodes; block-wise graph construction, random masking, and dynamic k-nearest-neighbor similarity thresholding jointly encode the relationships between samples; and a GAT learns node-level confidence scores that identify the most informative instances. Contribution/Results: The work integrates GATs into IS and introduces a scalable, graph-structured evaluation framework. On 13 benchmark datasets, the method achieves an average data reduction rate of 96% while maintaining or improving classification accuracy, outperforming conventional IS approaches. The results suggest that explicit graph-structural modeling improves both the effectiveness and the generalizability of instance importance assessment.
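
The graph-construction step described in the summary can be illustrated with a small sketch. This is not the paper's implementation: the function names, the fixed cosine-similarity threshold (the summary describes a dynamic k-nearest-neighbor threshold), the chunk size, and the masking probability are all assumptions chosen for illustration.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_chunk_edges(X, chunk_size=4, threshold=0.9, mask_prob=0.0, seed=0):
    """Return undirected edges (i, j) with i < j, built one chunk at a time.

    Assumed scheme, not the paper's exact procedure: within each chunk, a pair
    is connected when its cosine similarity is at least `threshold`, and each
    candidate edge is then dropped with probability `mask_prob`.
    """
    rng = random.Random(seed)
    edges = []
    for start in range(0, len(X), chunk_size):
        stop = min(start + chunk_size, len(X))
        for i in range(start, stop):
            for j in range(i + 1, stop):
                if cosine(X[i], X[j]) < threshold:
                    continue  # similarity thresholding: pair is too dissimilar
                if rng.random() < mask_prob:
                    continue  # random masking: drop this candidate edge
                edges.append((i, j))
    return edges


# Toy feature vectors (hypothetical data): two directions in the plane.
X = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [1.0, 0.05], [0.0, 1.0]]
edges = build_chunk_edges(X, chunk_size=4, threshold=0.9, mask_prob=0.0)
```

Because pairs are only compared within a chunk, the number of similarity computations grows with the chunk size rather than with the full dataset size, at the cost of missing edges between similar samples that land in different chunks.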

📝 Abstract

Instance selection (IS) is a crucial technique in machine learning that aims to reduce dataset size while maintaining model performance. This paper introduces a novel method called Graph Attention-based Instance Selection (GAIS), which leverages Graph Attention Networks (GATs) to identify the most informative instances in a dataset. GAIS represents the data as a graph and uses GATs to learn node representations, enabling it to capture complex relationships between instances. The method processes data in chunks, applies random masking and similarity thresholding during graph construction, and selects instances based on confidence scores from the trained GAT model. Experiments on 13 diverse datasets demonstrate that GAIS consistently outperforms traditional IS methods in terms of effectiveness, achieving high reduction rates (average 96%) while maintaining or improving model performance. Although GAIS exhibits slightly higher computational costs, its superior performance in maintaining accuracy with significantly reduced training data makes it a promising approach for graph-based data selection.
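
As a rough illustration of the final selection step, the sketch below keeps the highest-confidence instances and computes the resulting reduction rate. It makes several assumptions that the abstract does not confirm: the confidence scores are hypothetical placeholders rather than outputs of a trained GAT, selection is done per class with a fixed `keep_fraction`, and the reduction rate is taken to be the commonly used 1 - |S| / |T| (selected set S, original set T).

```python
import math
from collections import defaultdict


def select_by_confidence(scores, labels, keep_fraction=0.04):
    """Keep the highest-confidence instances of each class.

    `scores[i]` stands in for a trained model's confidence in instance i;
    the values used below are made up. At least one instance per class is kept.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    selected = []
    for members in by_class.values():
        n_keep = max(1, math.floor(keep_fraction * len(members)))
        ranked = sorted(members, key=lambda i: scores[i], reverse=True)
        selected.extend(ranked[:n_keep])
    return sorted(selected)


def reduction_rate(n_original, n_selected):
    """Fraction of the original instances removed by selection."""
    return 1.0 - n_selected / n_original


# Hypothetical data: 100 instances, two balanced classes, increasing scores.
labels = [0] * 50 + [1] * 50
scores = [i / 100 for i in range(100)]
kept = select_by_confidence(scores, labels, keep_fraction=0.04)
rate = reduction_rate(len(labels), len(kept))
```

With these toy inputs, four of the 100 instances are retained, which corresponds to a reduction rate of about 0.96. Selecting per class is a design choice made here to keep small classes from disappearing entirely; a global top-k over all scores would be a simpler alternative.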

Problem

Research questions and friction points this paper is trying to address.

Data Reduction

Machine Learning

Graph Data

Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph Attention Networks

Instance Selection

Data Reduction

🔎 Similar Papers

LLM-Enhanced User-Item Interactions: Leveraging Edge Information for Optimized Recommendations