🤖 AI Summary
Sentiment analysis tools exhibit significant performance variability across heterogeneous software engineering data sources (e.g., GitHub, Stack Overflow), primarily due to divergent developer communication styles and dataset distribution shifts. This variability hinders accurate team-dynamics modeling and trustworthy AI-driven requirements engineering.
Method: We propose a data-characteristic-aware tool recommendation framework: (1) systematically extract linguistic and statistical features from multi-source datasets and benchmark 14 state-of-the-art tools, including SetFit and RoBERTa-based models, on them; (2) learn interpretable mappings from data characteristics to tool performance; and (3) deploy a lightweight questionnaire that enables rapid feature annotation for unseen scenarios.
Contribution/Results: Empirical evaluation confirms that dataset characteristics strongly influence tool efficacy. Our approach improves tool selection accuracy by 23.6%, markedly enhancing the reliability, contextual adaptability, and interpretability of sentiment analysis outcomes, and establishes a practical basis for trustworthy sentiment computing in software engineering.
📝 Abstract
Software development relies heavily on text-based communication, making sentiment analysis a valuable tool for understanding team dynamics and supporting trustworthy AI-driven analytics in requirements engineering. However, existing sentiment analysis tools often perform inconsistently across datasets from different platforms, due to variations in communication style and content.
In this study, we analyze linguistic and statistical features of 10 developer communication datasets from five platforms and evaluate the performance of 14 sentiment analysis tools. Based on these results, we propose a mapping approach and accompanying questionnaire that recommend suitable sentiment analysis tools for new datasets, using their characteristic features as input.
Our results show that dataset characteristics can be leveraged to improve tool selection, as platforms differ substantially in both linguistic and statistical properties. While transformer-based models such as SetFit and RoBERTa consistently achieve strong results, tool effectiveness remains context-dependent. Our approach supports researchers and practitioners in selecting trustworthy tools for sentiment analysis in software engineering, while highlighting the need for ongoing evaluation as communication contexts evolve.
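The core recommendation step, mapping a new dataset's characteristic features to a tool that performed well on similar data, can be sketched as a nearest-neighbour lookup over feature profiles. The profiles, feature names, and tool assignments below are illustrative assumptions, not values from the paper:

```python
import math

# Hypothetical feature profiles of previously evaluated datasets
# (values are illustrative, not taken from the study).
KNOWN_PROFILES = {
    "github_issues":  {"avg_tokens": 42.0, "code_ratio": 0.30, "emoji_rate": 0.05},
    "stack_overflow": {"avg_tokens": 65.0, "code_ratio": 0.45, "emoji_rate": 0.01},
    "app_reviews":    {"avg_tokens": 18.0, "code_ratio": 0.00, "emoji_rate": 0.20},
}

# Hypothetical best-performing tool per profiled dataset.
BEST_TOOL = {
    "github_issues": "SetFit",
    "stack_overflow": "RoBERTa",
    "app_reviews": "SentiStrength-SE",
}

def euclidean(a: dict, b: dict) -> float:
    """Distance between two feature dicts sharing the same keys."""
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in a))

def recommend_tool(features: dict) -> str:
    """Recommend the tool that performed best on the most similar
    known dataset (1-nearest-neighbour in feature space)."""
    nearest = min(KNOWN_PROFILES,
                  key=lambda name: euclidean(features, KNOWN_PROFILES[name]))
    return BEST_TOOL[nearest]

# A new dataset whose features resemble GitHub issue comments:
print(recommend_tool({"avg_tokens": 40.0, "code_ratio": 0.28, "emoji_rate": 0.06}))
```

In a real setting the features would need to be normalized (here `avg_tokens` dominates the distance) and the questionnaire would supply the feature values for datasets that cannot be profiled automatically.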