🤖 AI Summary
This study addresses the high annotation cost and error-proneness of manually identifying concurrency bug reports. The authors propose a classification framework based on multi-granularity linguistic patterns, constructing 58 domain-specific language patterns across four levels: word, phrase, sentence, and report. The approach integrates pattern matching, traditional machine learning, fine-tuning of pre-trained language models (PLMs), and prompting of large language models (LLMs); notably, domain-specific linguistic knowledge is injected directly into the PLM fine-tuning process. The work also releases a high-quality annotated dataset. Experiments show that the method achieves precision of 91% and 93% on the GitHub and Jira datasets, respectively, and maintains a high precision of 91% on hold-out test data.
📝 Abstract
With the growing ubiquity of multi-core architectures, concurrent systems have become essential but increasingly prone to complex issues such as data races and deadlocks. While modern issue-tracking systems facilitate the reporting of such problems, labeling concurrency-related bug reports remains a labor-intensive and error-prone task. This paper presents a linguistic-pattern-based framework for automatically identifying concurrency bug reports. We derive 58 distinct linguistic patterns from 730 manually labeled concurrency bug reports, organized across four levels: word-level (keywords), phrase-level (n-grams), sentence-level (semantics), and bug-report-level (context). To assess their effectiveness, we evaluate four complementary approaches (matching, learning, prompting, and fine-tuning), spanning traditional machine learning, large language models (LLMs), and pre-trained language models (PLMs). Our comprehensive evaluation on 12 large-scale open-source projects (10,920 issue reports from GitHub and Jira) demonstrates that fine-tuning PLMs with linguistic-pattern-enriched inputs achieves the best performance, reaching a precision of 91% on GitHub and 93% on Jira, and maintaining strong precision (91%) on post-cutoff data. The contributions of this work are: (1) a comprehensive taxonomy of linguistic patterns for concurrency bugs, (2) a novel fine-tuning strategy that integrates domain-specific linguistic knowledge into PLMs, and (3) a curated, labeled dataset to support reproducible research. Together, these advances provide a foundation for improving the automation, precision, and interpretability of concurrency bug classification.
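To make the word- and phrase-level granularities concrete, the following is a minimal, illustrative sketch of pattern matching against a bug report's text. The keyword set and n-gram regexes below are hypothetical examples chosen for illustration; they are not the paper's actual 58 patterns, and the real framework additionally applies sentence-level and report-level patterns plus learned models.

```python
import re

# Hypothetical word-level patterns (keywords); not the paper's real pattern set.
WORD_PATTERNS = {"deadlock", "race", "mutex", "semaphore", "atomicity"}

# Hypothetical phrase-level patterns (n-grams), matched case-insensitively.
PHRASE_PATTERNS = [
    re.compile(r"\bdata race\b", re.IGNORECASE),
    re.compile(r"\bthread[- ]safe\b", re.IGNORECASE),
    re.compile(r"\block (?:contention|ordering)\b", re.IGNORECASE),
]

def match_concurrency_patterns(report: str) -> bool:
    """Flag a report as concurrency-related if any word or phrase pattern matches."""
    # Word level: tokenize and intersect with the keyword set.
    tokens = {t.lower() for t in re.findall(r"[A-Za-z-]+", report)}
    if tokens & WORD_PATTERNS:
        return True
    # Phrase level: scan for multi-word n-gram patterns.
    return any(p.search(report) for p in PHRASE_PATTERNS)

print(match_concurrency_patterns("App hangs: deadlock between worker threads"))  # True
print(match_concurrency_patterns("Button color is wrong on dark theme"))         # False
```

In the paper's fine-tuning variant, matches like these would not classify the report directly; instead, the matched linguistic cues enrich the input fed to the PLM.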