RationAnomaly: Log Anomaly Detection with Rationality via Chain-of-Thought and Reinforcement Learning

📅 2025-09-18
🤖 AI Summary
Existing log anomaly detection methods face two challenges: conventional deep learning models suffer from limited interpretability and generalization, while large language models (LLMs) exhibit hallucination, factual inaccuracies, and low reliability. To address these issues, this paper proposes the first unified framework integrating chain-of-thought-guided supervised fine-tuning (CoT-SFT) with multi-dimensional reward reinforcement learning (RL). The authors construct a high-quality training dataset via expert calibration and design a reward mechanism that jointly optimizes accuracy and logical consistency, which improves reasoning coherence and suppresses hallucination. Evaluated on mainstream benchmarks, the method achieves state-of-the-art F1 scores and generates verifiable, step-by-step reasoning traces. The source code and dataset are publicly released.
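
For readers unfamiliar with the metric mentioned above, the following is a minimal sketch of how an F1 score could be computed for binary log anomaly detection. The label strings ("normal", "abnormal") and the function name are assumptions made for illustration; the summary does not specify the evaluation code.

```python
def f1_score(predicted, actual, positive="abnormal"):
    """Compute F1 for the positive (anomalous) class.

    The label strings are illustrative assumptions, not taken from the paper.
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == positive and a == positive)
    fp = sum(1 for p, a in pairs if p == positive and a != positive)
    fn = sum(1 for p, a in pairs if p != positive and a == positive)
    if tp == 0:
        # No true positives: precision or recall is zero, so F1 is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    predicted = ["abnormal", "normal", "abnormal", "normal"]
    actual = ["abnormal", "abnormal", "normal", "normal"]
    # One true positive, one false positive, one false negative.
    print(f1_score(predicted, actual))
```

F1 is the harmonic mean of precision and recall, so a detector cannot score well by flagging everything (low precision) or almost nothing (low recall).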

📝 Abstract
Logs constitute a form of evidence signaling the operational status of software systems. Automated log anomaly detection is crucial for ensuring the reliability of modern software systems. However, existing approaches face significant limitations: traditional deep learning models lack interpretability and generalization, while methods leveraging Large Language Models are often hindered by unreliability and factual inaccuracies. To address these issues, we propose RationAnomaly, a novel framework that enhances log anomaly detection by synergizing Chain-of-Thought (CoT) fine-tuning with reinforcement learning. Our approach first instills expert-like reasoning patterns using CoT-guided supervised fine-tuning, grounded in a high-quality dataset corrected through a rigorous expert-driven process. Subsequently, a reinforcement learning phase with a multi-faceted reward function optimizes for accuracy and logical consistency, effectively mitigating hallucinations. Experimentally, RationAnomaly outperforms state-of-the-art baselines, achieving superior F1-scores on key benchmarks while providing transparent, step-by-step analytical outputs. We have released the corresponding resources, including code and datasets.
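
To make the idea of a reward that balances accuracy and logical consistency more concrete, here is a rough sketch under stated assumptions. It is not the paper's reward function: the `ModelOutput` type, the keyword-based consistency check, the label strings, and the weights are all hypothetical choices for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class ModelOutput:
    reasoning: str  # free-text chain-of-thought produced by the model
    verdict: str    # final label, assumed to be "normal" or "abnormal"


def accuracy_reward(output: ModelOutput, gold_label: str) -> float:
    """1.0 when the final verdict matches the reference label, else 0.0."""
    return 1.0 if output.verdict == gold_label else 0.0


def consistency_reward(output: ModelOutput) -> float:
    """Crude proxy for logical consistency (an assumption, not the paper's).

    Looks at the last label word mentioned in the reasoning and checks
    whether it agrees with the final verdict.
    """
    mentioned = re.findall(r"\b(normal|abnormal)\b", output.reasoning.lower())
    if not mentioned:
        return 0.5  # reasoning never commits to a label: neutral score
    return 1.0 if mentioned[-1] == output.verdict else 0.0


def combined_reward(output: ModelOutput, gold_label: str,
                    w_acc: float = 0.75, w_cons: float = 0.25) -> float:
    """Weighted sum of the two facets; the weights are hypothetical."""
    return (w_acc * accuracy_reward(output, gold_label)
            + w_cons * consistency_reward(output))


if __name__ == "__main__":
    out = ModelOutput(
        reasoning="Disk I/O error repeated three times; this is abnormal.",
        verdict="abnormal",
    )
    print(combined_reward(out, "abnormal"))
```

The point of the second term is that a correct verdict reached through reasoning that contradicts it earns less than a correct verdict with matching reasoning, which is one plausible way to penalize hallucinated rationales.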
Problem

Research questions and friction points this paper is trying to address.

Improving log anomaly detection accuracy and interpretability
Addressing unreliability and factual inaccuracies in existing methods
Combining Chain-of-Thought reasoning with reinforcement learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Chain-of-Thought fine-tuning for reasoning patterns
Reinforcement learning with multi-faceted reward function
Expert-corrected dataset for supervised fine-tuning
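
As an illustration of what a chain-of-thought training record for supervised fine-tuning might look like, the sketch below builds a prompt/completion pair from a log line, expert-written reasoning steps, and a corrected label. The record layout, the prompt wording, the function names, and the sample log line are assumptions; the released dataset may be structured differently.

```python
import json

VALID_LABELS = {"normal", "abnormal"}  # assumed label set


def build_sft_record(log_line, reasoning_steps, label):
    """Build one hypothetical CoT-SFT example as a prompt/completion pair."""
    if label not in VALID_LABELS:
        raise ValueError(f"unexpected label: {label!r}")
    prompt = (
        "Analyze the following log entry step by step, "
        "then state whether it is normal or abnormal.\n\n"
        f"Log: {log_line}"
    )
    # Number the reasoning steps so the target output is an explicit trace.
    steps = "\n".join(
        f"Step {i}: {step}" for i, step in enumerate(reasoning_steps, start=1)
    )
    completion = f"{steps}\nVerdict: {label}"
    return {"prompt": prompt, "completion": completion}


def to_jsonl_line(record):
    """Serialize a record as one line of a JSON Lines file."""
    return json.dumps(record, ensure_ascii=False)


if __name__ == "__main__":
    record = build_sft_record(
        log_line="kernel: I/O error on device sda, logical block 1024",  # made-up sample
        reasoning_steps=[
            "The message comes from the kernel and reports an I/O error.",
            "I/O errors on a block device indicate a possible hardware fault.",
        ],
        label="abnormal",
    )
    print(to_jsonl_line(record))
```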
Song Xu
JD AI Research
natural language processing, text generation, recommender systems
Yilun Liu
Huawei, Beijing, China
Minggui He
Huawei, Beijing, China
Mingchen Dai
School of Software Engineering, University of Science and Technology of China, Hefei, China
Ziang Chen
Nankai University, Tianjin, China
Chunguang Zhao
Huawei, Beijing, China
Jingzhou Du
Huawei, Beijing, China
Shimin Tao
2012 Lab, Huawei co. LTD
Machine Translation, AIOps, Log Analysis
Weibin Meng
Huawei, Beijing, China
Shenglin Zhang
Nankai University
AI Operations in general
Yongqian Sun
Nankai University
AIOps, Anomaly Detection, Failure Localization, Microservices Fault Diagnosis, Root Cause Analysis
Boxing Chen
Huawei Technologies Canada
Natural Language Processing, Artificial Intelligence
Daimeng Wei
Huawei, Beijing, China