Combining LLM Semantic Reasoning with GNN Structural Modeling for Multi-view Multi-Label Feature Selection

📅 2025-11-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper addresses multi-view multi-label feature selection with a two-level heterogeneous graph framework that integrates large language models (LLMs) and graph neural networks (GNNs). The framework builds two complementary levels: a semantic level, in which an LLM rates the relevance among feature, view, and label descriptions, and a statistical level, which captures statistical relations in the multi-view data. A lightweight graph attention network (GAT) learns node embeddings over the resulting graph, and these embeddings serve as feature saliency scores for ranking and selection. The main contribution is bringing LLM-derived semantic information into a task that existing methods approach mainly through statistical information. Experiments on multiple benchmark datasets show that the method outperforms state-of-the-art baselines and remains effective on small-scale datasets.

📝 Abstract

Multi-view multi-label feature selection aims to identify informative features from heterogeneous views, where each sample is associated with multiple interdependent labels. This problem is particularly important in machine learning tasks involving high-dimensional, multimodal data, such as those arising in social media, bioinformatics, or recommendation systems. Existing Multi-View Multi-Label Feature Selection (MVMLFS) methods mainly analyze the statistical information in the data and seldom consider semantic information. In this paper, we use these two types of information jointly and propose a method that combines the semantic reasoning of Large Language Models (LLMs) with the structural modeling of Graph Neural Networks (GNNs) for MVMLFS. The method consists of three main components. (1) An LLM is first used as an evaluation agent to assess the latent semantic relevance among feature, view, and label descriptions. (2) A semantic-aware heterogeneous graph with two levels is designed to represent relations among features, views, and labels: one level is a semantic graph representing semantic relations, and the other is a statistical graph. (3) A lightweight Graph Attention Network (GAT) is applied to learn node embeddings in the heterogeneous graph, which serve as feature saliency scores for ranking and selection. Experimental results on multiple benchmark datasets demonstrate the superiority of our method over state-of-the-art baselines. The method also remains effective on small-scale datasets, showcasing its robustness, flexibility, and generalization ability.
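Components (1) and (2) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how edges are weighted or thresholded, so the sketch assumes that statistical edges come from absolute Pearson correlation between a feature and a label, that semantic edges come from precomputed LLM relevance ratings in [0, 1] (passed in as a plain dictionary, with no LLM call), and that feature-view membership edges appear in both levels. The function names, node naming scheme, and thresholds are all hypothetical.

```python
from math import sqrt


def pearson_abs(xs, ys):
    """Absolute Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return abs(cov) / (sx * sy) if sx > 0 and sy > 0 else 0.0


def build_two_level_graph(views, labels, semantic_scores,
                          sem_threshold=0.5, stat_threshold=0.3):
    """Build the semantic and statistical edge sets of a two-level graph.

    views: list of views; each view is a list of feature columns
           (one list of per-sample values per feature).
    labels: list of label columns (one list of per-sample values per label).
    semantic_scores: dict {(feature_id, label_id): score in [0, 1]};
                     stands in for LLM relevance ratings (assumption).
    Returns (semantic_edges, statistical_edges), each a dict mapping
    (node, node) -> weight, with nodes named ("feature", i), ("view", v),
    or ("label", j).
    """
    semantic_edges, statistical_edges = {}, {}
    feature_id = 0
    for v, view in enumerate(views):
        for column in view:
            f = ("feature", feature_id)
            # Feature-view membership edge, assumed present in both levels.
            semantic_edges[(f, ("view", v))] = 1.0
            statistical_edges[(f, ("view", v))] = 1.0
            for j, label_column in enumerate(labels):
                # Statistical level: keep sufficiently correlated pairs.
                corr = pearson_abs(column, label_column)
                if corr >= stat_threshold:
                    statistical_edges[(f, ("label", j))] = corr
                # Semantic level: keep sufficiently relevant pairs.
                score = semantic_scores.get((feature_id, j), 0.0)
                if score >= sem_threshold:
                    semantic_edges[(f, ("label", j))] = score
            feature_id += 1
    return semantic_edges, statistical_edges
```

Keeping the two levels as separate edge sets means a downstream model can treat semantic and statistical relations as distinct edge types, or merge them, depending on the design chosen.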

Problem

Research questions and friction points this paper is trying to address.

Selecting informative features from multi-view multi-label heterogeneous data

Integrating semantic reasoning with structural modeling for feature selection

Addressing limitations of statistical-only approaches in high-dimensional multimodal data

Innovation

Methods, ideas, or system contributions that make the work stand out.

An LLM evaluates semantic relevance among features, views, and labels

Semantic-aware heterogeneous graph models statistical and semantic relations

Lightweight GAT learns embeddings for feature ranking and selection
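The attention-based scoring step can be sketched as a single-head graph attention layer followed by a top-k ranking. This is a generic GAT layer written with the standard library only, not the paper's architecture: the weights `W` and `a` would normally be learned, training is omitted, and using a one-dimensional output as the saliency score is an assumption made for illustration. The names `gat_layer` and `rank_features` and the `("feature", i)` node naming are hypothetical.

```python
from math import exp


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def gat_layer(h, neighbors, W, a):
    """One single-head graph attention layer.

    h: dict node -> input vector (list of floats).
    neighbors: dict node -> list of neighbor nodes.
    W: weight matrix as a list of rows (out_dim x in_dim).
    a: attention vector of length 2 * out_dim.
    Returns dict node -> output vector of length out_dim.
    """
    # Linear projection z_i = W h_i for every node.
    z = {n: [sum(w * x for w, x in zip(row, vec)) for row in W]
         for n, vec in h.items()}
    out = {}
    for i in h:
        # Attend over the node itself plus its neighbors (self-loop).
        nbrs = [i] + [j for j in neighbors.get(i, []) if j != i]
        # Unnormalized attention: LeakyReLU(a^T [z_i || z_j]).
        logits = [leaky_relu(sum(c * v for c, v in zip(a, z[i] + z[j])))
                  for j in nbrs]
        # Numerically stable softmax over the neighborhood.
        m = max(logits)
        weights = [exp(l - m) for l in logits]
        total = sum(weights)
        alphas = [w / total for w in weights]
        # Attention-weighted sum of projected neighbor vectors.
        out[i] = [sum(al * z[j][k] for al, j in zip(alphas, nbrs))
                  for k in range(len(W))]
    return out


def rank_features(scores, k):
    """Return the k feature nodes with the highest first output component."""
    feats = [n for n in scores if n[0] == "feature"]
    return sorted(feats, key=lambda n: scores[n][0], reverse=True)[:k]
```

In a two-level graph, `neighbors` could be built from the union of semantic and statistical edges, or the layer could be applied per level and the results combined; the source does not specify which, so that choice is left open here.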
