Efficient Topic Extraction via Graph-Based Labeling: A Lightweight Alternative to Deep Models

📅 2025-11-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the high computational cost and poor interpretability of deep models in topic modeling, this paper proposes a lightweight graph-enhanced topic labeling method. Unlike LLM-dependent approaches, it constructs a semantic word graph over pre-trained word embeddings, uses graph propagation to expand topic vocabularies and uncover structured semantic relationships among words, and generates concise, semantically coherent topic labels. Experiments on two datasets show that the method outperforms traditional baselines (e.g., LDA+KeyBERT) in label quality, measured by BERTScore and cosine similarity, and produces results comparable to ChatGPT-3.5, while achieving 3–5× faster inference and reducing memory consumption by ~70%. The core contribution is the first integration of graph-structured modeling into topic labeling, enabling efficient, low-resource, and highly interpretable automatic label generation.

📝 Abstract
Extracting topics from text has become an essential task, especially with the rapid growth of unstructured textual data. Most existing works rely on computationally intensive methods to address this challenge. In this paper, we argue that probabilistic and statistical approaches, such as topic modeling (TM), can offer effective alternatives that require fewer computational resources. TM is a statistical method that automatically discovers topics in large collections of unlabeled text; however, it represents each topic as a distribution over representative words, which often lacks clear interpretability. Our objective is to perform topic labeling by assigning meaningful labels to these sets of words. To achieve this without relying on computationally expensive models, we propose a graph-based approach that not only enriches topic words with semantically related terms but also explores the relationships among them. By analyzing these connections within the graph, we derive suitable labels that accurately capture each topic's meaning. We present a comparative study between our proposed method and several baselines, including ChatGPT-3.5, across two datasets. Our method achieved consistently better results than traditional baselines in terms of BERTScore and cosine similarity and produced results comparable to ChatGPT-3.5, while remaining computationally efficient. Finally, we discuss future directions for topic labeling and highlight potential research avenues for enhancing interpretability and automation.
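For reference, the cosine similarity used as one of the evaluation metrics is the standard measure between two embedding vectors u and v, shown below. Which vectors are compared (for example, an embedding of the generated label against an embedding of a reference label) is an assumption here; the abstract does not specify it.

```latex
\cos(\mathbf{u}, \mathbf{v})
  = \frac{\mathbf{u} \cdot \mathbf{v}}{\lVert \mathbf{u} \rVert \, \lVert \mathbf{v} \rVert}
  = \frac{\sum_{i=1}^{d} u_i v_i}{\sqrt{\sum_{i=1}^{d} u_i^2}\,\sqrt{\sum_{i=1}^{d} v_i^2}}
```

Values close to 1 indicate that the two vectors point in nearly the same direction, which is read as the two texts being semantically close.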
Problem

Research questions and friction points this paper is trying to address.

Assigning meaningful labels to topic modeling outputs without relying on computationally expensive models
Enhancing topic interpretability through graph-based analysis of semantic relationships
Providing a lightweight alternative to deep learning models for topic labeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph-based labeling for topic extraction
Lightweight alternative to deep learning models
Semantic enrichment of topic words via graphs
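The ideas listed above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy 3-dimensional vectors, the similarity threshold of 0.95, the one-hop expansion, and the weighted-degree scoring are all assumptions chosen for illustration, and the function names (`build_graph`, `expand_topic`, `pick_label`) are hypothetical. A real pipeline would load pre-trained word embeddings instead of the hand-written ones.

```python
import math

# Toy embeddings, made up for illustration only (assumption: a real system
# would use pre-trained word vectors with hundreds of dimensions).
EMBEDDINGS = {
    "goal":     (0.90, 0.10, 0.00),
    "match":    (0.80, 0.20, 0.10),
    "player":   (0.85, 0.15, 0.05),
    "football": (0.88, 0.12, 0.03),
    "sport":    (0.80, 0.25, 0.10),
    "election": (0.10, 0.90, 0.10),
    "vote":     (0.05, 0.95, 0.05),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def build_graph(embeddings, threshold=0.95):
    """Connect every pair of words whose similarity reaches the threshold.

    Returns an adjacency dict: word -> {neighbor: similarity}.
    """
    graph = {word: {} for word in embeddings}
    words = list(embeddings)
    for i, w1 in enumerate(words):
        for w2 in words[i + 1:]:
            sim = cosine(embeddings[w1], embeddings[w2])
            if sim >= threshold:
                graph[w1][w2] = sim
                graph[w2][w1] = sim
    return graph


def expand_topic(graph, topic_words):
    """Enrich the topic words with their direct neighbors in the graph."""
    expanded = {w for w in topic_words if w in graph}
    for word in list(expanded):
        expanded.update(graph[word])
    return expanded


def pick_label(graph, topic_words):
    """Pick the candidate most strongly connected to the other candidates.

    Scoring by weighted degree inside the expanded set is an assumption;
    other centrality measures could be substituted.
    """
    candidates = expand_topic(graph, topic_words)
    if not candidates:
        return None

    def score(word):
        return sum(sim for nbr, sim in graph[word].items() if nbr in candidates)

    # Sorting first makes tie-breaking deterministic.
    return max(sorted(candidates), key=score)


if __name__ == "__main__":
    word_graph = build_graph(EMBEDDINGS)
    topic = ["goal", "match", "player"]
    print("expanded:", sorted(expand_topic(word_graph, topic)))
    print("label:", pick_label(word_graph, topic))
```

In this sketch the label can be a word that was not among the original topic words, since the expansion step adds neighbors before scoring. That mirrors the enrichment idea, although the paper's own propagation and label-selection rules may differ.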
Salma Mekaooui
University of Limerick, Department of Computer Science and Information Systems, Ireland
Hiba Sofyan
Euromed University of Fez, School of Digital Engineering and Artificial Intelligence, Morocco
Imane Amaaz
Euromed University of Fez, School of Digital Engineering and Artificial Intelligence, Morocco
Imane Benchrif
Euromed University of Fez, School of Digital Engineering and Artificial Intelligence, Morocco
A. Zarghili
Faculty of Sciences and Technology, University Sidi Mohamed Ben Abdellah, Morocco
Ilham Chaker
Faculty of Sciences and Technology, University Sidi Mohamed Ben Abdellah, Morocco
Nikola S. Nikolov
Associate Professor, Department of Computer Science and Information Systems, University of Limerick
Machine Learning, NLP, Graph Drawing