LogAction: Consistent Cross-system Anomaly Detection through Logs via Active Domain Adaptation

📅 2025-09-29
🤖 AI Summary
This paper proposes LogAction, a lightweight detection framework based on active domain adaptation that targets three challenges in cross-system log anomaly detection: high annotation cost, distribution shift between source and target systems, and cold start. LogAction combines transfer learning and active learning. It trains on labeled logs from a mature system to mitigate the cold-start problem, and it uses a joint sampling strategy based on free-energy estimation and uncertainty quantification to pick boundary-region samples for annotation, which keeps labeling effort low. The framework incorporates log representation learning, domain alignment, and adaptive thresholding. Evaluated on six cross-system dataset combinations, LogAction achieves an average F1-score of 93.01% with only 2% of samples labeled, outperforming some state-of-the-art methods by 26.28%.

📝 Abstract
Log-based anomaly detection is an essential task for ensuring the reliability and performance of software systems. However, the performance of existing anomaly detection methods relies heavily on labeled data, and labeling a large volume of logs is highly challenging. To address this issue, many approaches based on transfer learning and active learning have been proposed. Nevertheless, their effectiveness is hindered by issues such as the gap between source and target system data distributions and cold-start problems. In this paper, we propose LogAction, a novel log-based anomaly detection model based on active domain adaptation. LogAction integrates transfer learning and active learning techniques. On one hand, it uses labeled data from a mature system to train a base model, mitigating the cold-start issue in active learning. On the other hand, LogAction utilizes free energy-based sampling and uncertainty-based sampling to select logs located at the distribution boundaries for manual labeling, thus addressing the data distribution gap in transfer learning with minimal human labeling effort. Experimental results on six different combinations of datasets demonstrate that LogAction achieves an average F1 score of 93.01% with only 2% of manual labels, outperforming some state-of-the-art methods by 26.28%. Website: https://logaction.github.io
Problem

Research questions and friction points this paper is trying to address.

Addresses log-based anomaly detection with limited labeled data
Mitigates data distribution gaps between source and target systems
Reduces manual labeling efforts through active domain adaptation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Active domain adaptation combining transfer and active learning
Base model trained with mature system data to mitigate cold-start
Free energy and uncertainty sampling for efficient log selection
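
The page does not give the sampling formulas, so the following is only a minimal sketch of how free-energy and uncertainty scores could be combined to pick logs for labeling. It assumes the common energy-based definition F(x) = -T * log(sum_k exp(logit_k / T)), uses softmax entropy as the uncertainty measure, and treats high free energy as a sign that the source-trained model fits a sample poorly. The function names, the two-stage ordering, and the `candidate_ratio` parameter are hypothetical and are not taken from the paper.

```python
import math

def free_energy(logits, temperature=1.0):
    """Free energy F(x) = -T * log(sum_k exp(logit_k / T)), computed stably."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    return -temperature * (m + math.log(sum(math.exp(s - m) for s in scaled)))

def entropy(logits):
    """Shannon entropy of softmax(logits); higher means a less certain prediction."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_for_labeling(batch_logits, budget_ratio=0.02, candidate_ratio=0.10):
    """Assumed two-stage pick: keep the highest-free-energy candidates,
    then return the most uncertain of them, up to the labeling budget."""
    n = len(batch_logits)
    n_label = max(1, round(n * budget_ratio))
    n_candidates = max(n_label, round(n * candidate_ratio))
    # Stage 1: samples with the highest free energy (assumed poorly fit by the base model).
    by_energy = sorted(range(n), key=lambda i: free_energy(batch_logits[i]), reverse=True)
    candidates = by_energy[:n_candidates]
    # Stage 2: among those, the samples whose predicted class is least certain.
    by_uncertainty = sorted(candidates, key=lambda i: entropy(batch_logits[i]), reverse=True)
    return sorted(by_uncertainty[:n_label])
```

With `budget_ratio=0.02`, a batch of 50 target-system log sequences yields a single index to send for manual labeling, mirroring the 2% labeling budget reported above. A real pipeline would retrain or fine-tune the base model after each labeling round, which this sketch leaves out.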
Authors

Chiming Duan, Peking University (AI4SE, AIOps)
Minghua He, Peking University (Large Language Model, Software Reliability)
Pei Xiao, University of Surrey (wireless communications)
Tong Jia, Peking University (AIOps, Anomaly Detection, Log Analysis, AI for Medical Research)
Xin Zhang, Bytedance, Beijing, China
Zhewei Zhong, Bytedance, Beijing, China
Xiang Luo, Nanjing University (Natural Language Processing, Task-Oriented Dialogue)
Yan Niu, Bytedance, Beijing, China
Lingzhe Zhang, Peking University (AIOps, Reinforcement Fine-Tuning, LSM)
Yifan Wu, School of Software and Microelectronics, Peking University, Beijing, China
Siyu Yu, Ph.D. student at Peking University (AIOps, Log analysis)
Weijie Hong, School of Software and Microelectronics, Peking University, Beijing, China
Ying Li, Institute for Artificial Intelligence, Peking University, Beijing, China
Gang Huang, National Key Laboratory of Data Space Technology and System, Beijing, China