🤖 AI Summary
Existing log anomaly detection methods face dual challenges: conventional deep learning models suffer from limited interpretability and generalization, while large language models (LLMs) exhibit hallucinations, factual inaccuracies, and poor reliability. To address these issues, this paper proposes the first unified framework integrating chain-of-thought-guided supervised fine-tuning (CoT-SFT) with multi-dimensional reward reinforcement learning (RL). We construct a high-quality training dataset via expert calibration and design a dual-objective reward mechanism that jointly optimizes accuracy and logical consistency, significantly enhancing reasoning coherence and suppressing hallucinations. Evaluated on mainstream benchmarks, our method achieves state-of-the-art F1 scores and generates verifiable, step-by-step reasoning traces. The source code and dataset are publicly released.
📝 Abstract
Logs constitute a form of evidence signaling the operational status of software systems, and automated log anomaly detection is crucial for ensuring their reliability. However, existing approaches face significant limitations: traditional deep learning models lack interpretability and generalization, while methods leveraging Large Language Models are often hindered by unreliability and factual inaccuracies. To address these issues, we propose RationAnomaly, a novel framework that enhances log anomaly detection by synergizing Chain-of-Thought (CoT) fine-tuning with reinforcement learning. Our approach first instills expert-like reasoning patterns using CoT-guided supervised fine-tuning, grounded in a high-quality dataset corrected through a rigorous expert-driven process. Subsequently, a reinforcement learning phase with a multi-faceted reward function optimizes for accuracy and logical consistency, effectively mitigating hallucinations. Experimentally, RationAnomaly outperforms state-of-the-art baselines, achieving superior F1-scores on key benchmarks while providing transparent, step-by-step analytical outputs. We have released the corresponding resources, including code and datasets.
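To give a concrete sense of what a multi-faceted reward of this kind might look like, here is a minimal Python sketch that combines a verdict-accuracy term with a crude logical-consistency term. The tag names (`<think>`, `<answer>`), weights, and heuristics are illustrative assumptions for exposition, not the paper's actual reward design.

```python
import re

def combined_reward(response: str, true_label: str,
                    w_acc: float = 0.8, w_logic: float = 0.2) -> float:
    """Score a model response on (a) verdict correctness and
    (b) presence of a coherent rationale preceding the verdict.

    Hypothetical sketch: weights and checks are assumptions,
    not RationAnomaly's published reward function.
    """
    # Accuracy term: does the final verdict match the ground-truth label?
    m = re.search(r"<answer>\s*(normal|anomaly)\s*</answer>", response, re.I)
    acc = 1.0 if (m and m.group(1).lower() == true_label) else 0.0

    # Logical-consistency term (proxy): a reasoning trace must exist
    # and must appear before the final answer.
    has_rationale = bool(re.search(r"<think>.*</think>", response, re.S))
    ordered = has_rationale and response.find("</think>") < response.find("<answer>")
    logic = 1.0 if ordered else 0.0

    # Dual-objective reward: weighted sum of accuracy and consistency.
    return w_acc * acc + w_logic * logic

good = ("<think>Repeated auth failures from one host suggest brute force."
        "</think><answer>anomaly</answer>")
bare = "<answer>anomaly</answer>"
print(combined_reward(good, "anomaly"))  # correct verdict + rationale -> 1.0
print(combined_reward(bare, "anomaly"))  # correct verdict, no rationale -> 0.8
```

A reward shaped this way penalizes answer-only outputs even when the label is right, which is one simple mechanism for pushing a policy toward verifiable step-by-step traces rather than unsupported verdicts.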