🤖 AI Summary
Configurable systems frequently suffer from misconfigurations and latent defects because their configuration spaces are complex. Existing diagnostic approaches focus predominantly on post-failure analysis and overlook how much software logs could reveal about configuration-related issues. This paper proposes a configuration-aware log enhancement method: it uses static taint analysis to trace how configuration variables propagate through the code, then uses large language models (LLMs) to generate context-sensitive diagnostic log statements from the surrounding configuration code, with improved selection of the variables to log. According to the authors, this is the first tool to unify configuration-aware static taint analysis with LLM-based log generation. Evaluated on eight popular software systems, the enhanced logs let a log-based diagnosis tool reach 100% error-localization accuracy across 30 silent misconfiguration scenarios, 80% of which can be resolved directly from the configuration information the logs expose. In a controlled user study, the method speeds up diagnosis by 1.25× and improves troubleshooting accuracy by 251.4%.
📝 Abstract
Modern configurable systems offer customization via intricate configuration spaces, yet such flexibility introduces pervasive configuration-related issues such as misconfigurations and latent software bugs. Existing diagnosability support focuses on post-failure analysis of software behavior to identify configuration issues, but none of these approaches examines whether the software exposes sufficient failure information for diagnosis. To fill this gap, we propose the idea of configuration logging to enhance existing logging practices at the source code level. We develop ConfLogger, the first tool that unifies configuration-aware static taint analysis with LLM-based log generation to enhance software configuration diagnosability. Specifically, our method 1) identifies configuration-sensitive code segments by tracing configuration-related data flow across the whole project, and 2) generates diagnostic log statements by analyzing configuration code contexts. Evaluation results on eight popular software systems demonstrate the effectiveness of ConfLogger in enhancing configuration diagnosability. ConfLogger-enhanced logs enable a log-based misconfiguration diagnosis tool to achieve 100% accuracy on error localization in 30 silent misconfiguration scenarios, 80% of which are directly resolvable through the explicit configuration information exposed. In addition, ConfLogger achieves 74% coverage of existing logging points, outperforming baseline LLM-based loggers by 12% and 30%. On variable logging, it achieves 8.6% higher precision, 79.3% higher recall, and 26.2% higher F1 than the state-of-the-art baseline, while also adding diagnostic value. A controlled user study on 22 cases further validated its utility, speeding up diagnosis by 1.25x and improving troubleshooting accuracy by 251.4%.
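
To make the idea of configuration logging concrete, the following is a minimal illustrative sketch in Python. It is not ConfLogger's output or implementation. The option name `cache.size.mb`, the bound `MAX_CACHE_MB`, and the function `effective_cache_size` are all hypothetical, chosen only to show the kind of silent misconfiguration the abstract refers to and the kind of log statement that would expose it.

```python
import logging

logger = logging.getLogger("conf_demo")

# Hypothetical upper bound for the illustration; not taken from the paper.
MAX_CACHE_MB = 1024


def effective_cache_size(config: dict) -> int:
    """Return the cache size actually used, logging any override of the user's value."""
    # The value read here is the configuration-derived data a taint analysis
    # would follow from the option name to the code that consumes it.
    requested = int(config.get("cache.size.mb", 64))
    if requested > MAX_CACHE_MB:
        # Without this statement the clamp is silent: the system keeps running
        # with a value the user never asked for and leaves no trace of why.
        logger.warning(
            "Config 'cache.size.mb'=%d exceeds the maximum %d; using %d instead",
            requested, MAX_CACHE_MB, MAX_CACHE_MB,
        )
        return MAX_CACHE_MB
    return requested
```

The log message names the option, the offending value, and the value actually applied, which is the sort of explicit configuration information that would let a user fix the setting directly from the log.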