🤖 AI Summary
This paper addresses the challenge of deeply integrating sentiment analysis with distributed systems, proposing a lightweight distributed training adaptation strategy tailored for NLP tasks. Methodologically, it designs a microservice-based training framework leveraging PyTorch, Horovod, and Apache Spark, and conducts systematic performance–accuracy trade-off evaluations across single-node and distributed configurations on the Twitter-15M and Amazon Reviews datasets. The core contributions are threefold: (1) an empirical characterization of the interplay among throughput, scalability, and accuracy in distributed sentiment analysis; (2) a 3.2× throughput improvement at 98.7% of ideal linear speedup; and (3) classification accuracy held within ±0.3% of the single-node baseline, demonstrating that distributed training can substantially outperform conventional single-node setups while preserving model fidelity at scale.
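The headline scaling numbers follow from two standard metrics: speedup (distributed throughput over the single-node baseline) and parallel efficiency (speedup divided by worker count, where 1.0 is perfectly linear scaling). A minimal sketch of the arithmetic, using hypothetical throughput figures for illustration rather than the paper's actual measurements:

```python
def speedup(single_node_tput: float, distributed_tput: float) -> float:
    """Throughput speedup of a distributed run over the single-node baseline."""
    return distributed_tput / single_node_tput

def parallel_efficiency(speedup_x: float, num_workers: int) -> float:
    """Fraction of ideal linear scaling achieved (1.0 = perfectly linear)."""
    return speedup_x / num_workers

# Hypothetical throughputs in samples/sec (not the paper's measured values):
single = 1000.0
distributed = 3200.0

s = speedup(single, distributed)          # 3.2x, matching the reported figure
eff = parallel_efficiency(3.948, 4)       # e.g. 3.948x on 4 workers -> 98.7%
print(f"speedup: {s:.1f}x, efficiency: {eff:.1%}")
```

Note that a 3.2× speedup and a 98.7% efficiency figure need not come from the same worker count; efficiency near 1.0 simply means communication overhead (e.g. Horovod's gradient allreduce) consumes a small fraction of each training step.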
📝 Abstract
Sentiment analysis is a field within NLP that has gained importance through its application in areas such as social media monitoring, customer feedback evaluation, and market research. At the same time, distributed systems enable effective processing of large volumes of data. This paper therefore examines how sentiment analysis converges with distributed systems, concentrating on approaches, challenges, and directions for future investigation. Furthermore, we conduct extensive experiments training sentiment analysis models on both a single-node configuration and a distributed architecture, to bring out the performance and accuracy benefits and shortcomings of each approach.