🤖 AI Summary
Current large language models (LLMs) lack expert-aligned, structured reasoning capabilities for log analysis, hindering the generation of fine-grained, interpretable diagnostic steps. To address this, we propose LogReasoner—a novel two-stage reasoning enhancement framework that progresses from coarse-grained to fine-grained inference. First, it constructs a high-level reasoning skeleton grounded in expert-derived workflow diagrams; second, it refines low-level reasoning details via task-specific stepwise fine-tuning and preference learning. Implemented on open-source LLMs including Qwen-2.5 and Llama-3, LogReasoner supports four log analysis tasks: anomaly detection, root-cause diagnosis, failure localization, and remediation planning. Extensive evaluation demonstrates significant improvements in both accuracy and reasoning interpretability over state-of-the-art baselines. Our results validate that structured reasoning augmentation effectively enhances LLMs’ log analysis capabilities while maintaining strong generalizability across diverse tasks and model architectures.
📝 Abstract
Log analysis is crucial for monitoring system health and diagnosing failures in complex systems. Recent advances in large language models (LLMs) offer new opportunities for automated log analysis, leveraging their reasoning capabilities to perform tasks such as anomaly detection and failure prediction. However, general-purpose LLMs struggle to formulate structured reasoning workflows that align with expert cognition and to deliver precise details in their reasoning steps. To address these challenges, we propose LogReasoner, a coarse-to-fine reasoning enhancement framework designed to enable LLMs to reason about log analysis tasks like experts. LogReasoner consists of two stages: (1) coarse-grained enhancement of expert thinking, where high-level expert thoughts are constructed from collected troubleshooting flowcharts and existing tasks to enable LLMs to formulate structured reasoning workflows; and (2) fine-grained enhancement of specific steps, where we first fine-tune the LLM on task-specific stepwise solutions to equip it for instantiated reasoning, then employ preference learning to calibrate its reasoning details from its mistakes, further strengthening its analytical granularity and correctness. We evaluate LogReasoner on four distinct log analysis tasks using open-source LLMs such as Qwen-2.5 and Llama-3. Experimental results show that LogReasoner significantly outperforms existing LLMs, achieving state-of-the-art performance and demonstrating its effectiveness in enhancing the reasoning capabilities of LLMs for log analysis.
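The fine-grained stage described above can be pictured with a minimal data-preparation sketch. This is not the paper's implementation; it is a hypothetical illustration of the two training signals it names: supervised examples that instantiate a high-level skeleton into stepwise solutions, and preference pairs built from the model's mistaken steps versus expert-corrected ones (the format a DPO-style trainer would consume). All function names, log messages, and fields here are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class PreferencePair:
    prompt: str
    chosen: str    # expert-corrected reasoning step
    rejected: str  # the model's mistaken step

def build_sft_examples(skeleton, stepwise_solutions):
    """Pair each high-level skeleton step with its task-specific
    instantiation, yielding supervised fine-tuning examples."""
    return [
        {"instruction": f"Skeleton step: {step}", "response": solution}
        for step, solution in zip(skeleton, stepwise_solutions)
    ]

def build_preference_pairs(prompts, model_steps, corrected_steps):
    """Turn the model's mistakes into preference pairs; a preference
    learning trainer (e.g. DPO) would then calibrate the model's
    reasoning details toward the corrected steps."""
    return [
        PreferencePair(prompt, chosen=good, rejected=bad)
        for prompt, bad, good in zip(prompts, model_steps, corrected_steps)
        if bad != good  # only mistaken steps yield a training signal
    ]

# Hypothetical log-analysis walkthrough
skeleton = ["Parse log template", "Check error frequency", "Localize fault"]
solutions = ["Template: 'Interface <*> down'", "Spike at 12:03", "Card in slot 2"]
sft_data = build_sft_examples(skeleton, solutions)

pairs = build_preference_pairs(
    prompts=["Why did interface eth0 fail?"],
    model_steps=["Blame DNS misconfiguration"],
    corrected_steps=["Check link-layer flaps before suspecting DNS"],
)
```

The `bad != good` filter reflects the abstract's framing that preference learning targets the model's mistakes: steps the model already gets right produce no pair.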