🤖 AI Summary
This work addresses the inefficiency caused by incomplete bug reports, which often lack critical details such as reproduction steps or descriptions of observed versus expected behavior, and proposes an automated improvement approach based on large language models. The method combines fine-tuned DistilBERT, heuristic rules, few-shot prompting, and retrieval-augmented generation (RAG) grounded in domain knowledge from the Minecraft Wiki to enable context-aware repair. Evaluated on 139 reports from the Mojira dataset, the approach raises structural completeness from 7.9% to 96.4%, increases the proportion of executable reproduction steps from 28.8% to 67.6%, and grows the number of fully reproducible reports from 1 to 13, making the reports more usable and actionable.
📝 Abstract
Bug tracking systems play a crucial role in software maintenance, yet developers frequently struggle with low-quality user-submitted reports that omit essential details such as Steps to Reproduce (S2R), Observed Behavior (OB), and Expected Behavior (EB). We propose ImproBR, an LLM-based pipeline that automatically detects and improves bug reports by addressing missing, incomplete, and ambiguous S2R, OB, and EB sections. ImproBR employs a hybrid detector combining fine-tuned DistilBERT, heuristic analysis, and an LLM analyzer, guided by GPT-4o mini with section-specific few-shot prompts and a Retrieval-Augmented Generation (RAG) pipeline grounded in Minecraft Wiki domain knowledge. Evaluated on Mojira, ImproBR improved structural completeness from 7.9% to 96.4%, more than doubled the proportion of executable S2R from 28.8% to 67.6%, and raised fully reproducible bug reports from 1 to 13 across 139 challenging real-world reports.
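To make the detection step more concrete, the following is a minimal sketch of what the heuristic part of a hybrid detector might look like. It is not ImproBR's implementation: the abstract does not specify the heuristic rules, so the cue patterns, the `SECTION_CUES` table, and the `detect_sections` function are all hypothetical. The fine-tuned DistilBERT and LLM analyzer components are omitted.

```python
import re
from dataclasses import dataclass

# Hypothetical cue patterns for each section; the rules ImproBR actually
# uses are not given in the abstract, so these are illustrative only.
SECTION_CUES = {
    # A section header, or any line that starts like a numbered step.
    "S2R": re.compile(r"steps to reproduce|how to reproduce|^\s*\d+[.)]\s+\S",
                      re.I | re.M),
    "OB": re.compile(r"observed|actual (result|behaviou?r)|what happens", re.I),
    "EB": re.compile(r"expected|should (be|have|not)|supposed to", re.I),
}


@dataclass
class SectionReport:
    present: list  # section names with at least one matching cue
    missing: list  # section names with no matching cue


def detect_sections(report_text: str) -> SectionReport:
    """Flag which of S2R, OB, and EB appear to be present in a bug report."""
    present = [name for name, pattern in SECTION_CUES.items()
               if pattern.search(report_text)]
    missing = [name for name in SECTION_CUES if name not in present]
    return SectionReport(present, missing)


if __name__ == "__main__":
    text = "Steps to reproduce:\n1. Place a chest\n2. Open it\nThe game freezes."
    print(detect_sections(text))
```

Keyword rules like these are cheap but brittle. In the example, "The game freezes" describes observed behavior, yet no cue matches it. This kind of gap is one plausible reason to pair heuristics with a learned classifier and an LLM analyzer, as the abstract describes.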