Automated Sentiment Classification and Topic Discovery in Large-Scale Social Media Streams

📅 2025-05-03
🤖 AI Summary
To analyze sentiment and thematic dynamics in large-scale Twitter streams during geopolitical conflicts, this paper proposes a topic modeling framework that partitions tweets jointly by sentiment and metadata. The method combines ensemble sentiment labeling from multiple pretrained models with metadata-aware LDA, in which tweets are grouped along temporal, geographic, and sentiment dimensions, and it includes an interactive spatiotemporal visualization system built with D3.js. The key innovation is to incorporate predicted sentiment labels explicitly into the prior structure of the topic model, which strengthens interpretable associations between sentiment and topics. Evaluated on real-world conflict-related Twitter data, the approach achieves sentiment classification F1-scores ≥ 0.89, identifies temporally evolving topic clusters, and enables cross-regional analysis of how sentiment and topics change together. The result is a deeper and more interpretable view of public opinion on social media under changing conditions.

📝 Abstract
We present a framework for large-scale sentiment and topic analysis of Twitter discourse. Our pipeline begins with targeted data collection using conflict-specific keywords, followed by automated sentiment labeling via multiple pre-trained models to improve annotation robustness. We examine the relationship between sentiment and contextual features such as timestamp, geolocation, and lexical content. To identify latent themes, we apply Latent Dirichlet Allocation (LDA) on partitioned subsets grouped by sentiment and metadata attributes. Finally, we develop an interactive visualization interface to support exploration of sentiment trends and topic distributions across time and regions. This work contributes a scalable methodology for social media analysis in dynamic geopolitical contexts.
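The partitioning step that precedes topic modeling can be illustrated with a minimal sketch. It assumes each tweet is a dictionary with hypothetical `text`, `sentiment`, `region`, and `created_at` fields and that time is bucketed by month; none of these names or choices come from the paper.

```python
from collections import defaultdict
from datetime import datetime


def partition_key(tweet, time_fmt="%Y-%m"):
    """Build a (sentiment, region, period) key for one labeled tweet."""
    # Assumption: created_at is an ISO 8601 string; monthly buckets by default.
    ts = datetime.fromisoformat(tweet["created_at"])
    return (tweet["sentiment"], tweet.get("region", "unknown"), ts.strftime(time_fmt))


def partition_tweets(tweets, min_docs=2):
    """Group tweet texts by key and drop partitions too small to model."""
    groups = defaultdict(list)
    for tweet in tweets:
        groups[partition_key(tweet)].append(tweet["text"])
    return {key: texts for key, texts in groups.items() if len(texts) >= min_docs}


# Illustrative records only, not data from the paper.
tweets = [
    {"text": "aid convoy reaches the border", "sentiment": "positive",
     "region": "EU", "created_at": "2022-03-01T10:00:00"},
    {"text": "ceasefire talks bring hope", "sentiment": "positive",
     "region": "EU", "created_at": "2022-03-15T08:30:00"},
    {"text": "shelling reported overnight", "sentiment": "negative",
     "region": "EU", "created_at": "2022-03-02T23:10:00"},
]
partitions = partition_tweets(tweets)
for key, texts in partitions.items():
    print(key, len(texts))
```

Each surviving partition's texts would then be passed to an LDA implementation, for example scikit-learn's `LatentDirichletAllocation` or gensim's `LdaModel`. The `min_docs` threshold is an illustrative guard against fitting topics on very small subsets.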

Problem

Research questions and friction points this paper is trying to address.

Large-scale sentiment analysis of Twitter discourse
Topic discovery using Latent Dirichlet Allocation (LDA)
Interactive visualization of sentiment trends and topics
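As a rough illustration of the data behind a sentiment-trend view, the sketch below aggregates labeled tweets into per-date, per-region sentiment shares and serializes them to JSON, a format a browser-based chart could load. The record fields (`date`, `region`, `sentiment`) and the output layout are assumptions made for this example, not the paper's schema.

```python
import json
from collections import Counter, defaultdict


def sentiment_shares(records):
    """Return per-(date, region) sentiment proportions as a list of dicts."""
    counts = defaultdict(Counter)
    for rec in records:
        counts[(rec["date"], rec["region"])][rec["sentiment"]] += 1
    rows = []
    for (date, region), label_counts in sorted(counts.items()):
        total = sum(label_counts.values())
        rows.append({
            "date": date,
            "region": region,
            "total": total,
            # Share of each sentiment label within this date/region cell.
            "shares": {label: n / total for label, n in sorted(label_counts.items())},
        })
    return rows


# Illustrative records only, not data from the paper.
records = [
    {"date": "2022-03-01", "region": "EU", "sentiment": "negative"},
    {"date": "2022-03-01", "region": "EU", "sentiment": "positive"},
    {"date": "2022-03-01", "region": "US", "sentiment": "negative"},
]
rows = sentiment_shares(records)
print(json.dumps(rows, indent=2))
```

Reporting shares alongside the raw `total` lets a viewer judge how much data supports each cell, which matters when some regions or dates have few tweets.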

Innovation

Methods, ideas, or system contributions that make the work stand out.

Targeted data collection using conflict-specific keywords
Automated sentiment labeling via multiple pre-trained models
Latent Dirichlet Allocation (LDA) for latent theme identification
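One simple way to combine labels from several models is majority voting, sketched below. The paper does not specify its combination rule here, so the voting scheme, the `min_agreement` threshold, and the three keyword-based stand-in classifiers are all assumptions for illustration; in practice each classifier would wrap a pretrained model.

```python
from collections import Counter


def ensemble_label(text, classifiers, min_agreement=0.5):
    """Return the majority label, or None when agreement is too low."""
    votes = Counter(clf(text) for clf in classifiers)
    label, count = votes.most_common(1)[0]
    # Require a strict majority by default so ties are left unlabeled.
    if count / len(classifiers) > min_agreement:
        return label
    return None


# Hypothetical stand-ins for pretrained sentiment models.
def model_a(text):
    return "negative" if "attack" in text else "neutral"


def model_b(text):
    return "negative" if "attack" in text or "fear" in text else "neutral"


def model_c(text):
    return "positive" if "hope" in text else "negative"


models = [model_a, model_b, model_c]
for sample in ["attack on the city", "calm day", "hope and fear"]:
    print(sample, "->", ensemble_label(sample, models))
```

Returning `None` on disagreement keeps ambiguous tweets out of the labeled set rather than forcing a label, which is one way to trade coverage for annotation robustness.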