🤖 AI Summary
Addressing three key challenges in cross-system log anomaly detection—high annotation cost, severe source-to-target domain distribution shift, and cold-start difficulty—this paper proposes LogAction, a lightweight detection framework based on active domain adaptation. Methodologically, LogAction synergistically integrates transfer learning and active learning: it leverages labeled logs from mature systems to mitigate cold-start issues, and introduces a joint sampling strategy combining free-energy estimation and uncertainty quantification to precisely identify boundary-region samples, thereby minimizing annotation effort. The framework jointly incorporates log representation learning, domain alignment, and adaptive thresholding. Evaluated on six cross-system datasets, LogAction achieves an average F1-score of 93.01% using only 2% labeled samples—outperforming baseline methods by 26.28% in F1.
📝 Abstract
Log-based anomaly detection is an essential task for ensuring the reliability and performance of software systems. However, the performance of existing anomaly detection methods heavily relies on labeled data, while labeling a large volume of logs is highly challenging. To address this issue, many approaches based on transfer learning and active learning have been proposed. Nevertheless, their effectiveness is hindered by issues such as the gap between source and target system data distributions and cold-start problems. In this paper, we propose LogAction, a novel log-based anomaly detection model based on active domain adaptation. LogAction integrates transfer learning and active learning techniques. On one hand, it uses labeled data from a mature system to train a base model, mitigating the cold-start issue in active learning. On the other hand, LogAction utilizes free energy-based sampling and uncertainty-based sampling to select logs located at the distribution boundaries for manual labeling, thus addressing the data distribution gap in transfer learning with minimal human labeling effort. Experimental results on six different combinations of datasets demonstrate that LogAction achieves an average 93.01% F1 score with only 2% of manual labels, outperforming some state-of-the-art methods by 26.28%. Website: https://logaction.github.io
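The joint sampling idea described in the abstract, scoring each unlabeled target-system log by both its free energy (how far it sits from the source distribution) and its predictive uncertainty, then sending the highest-scoring samples for manual labeling, can be sketched roughly as follows. This is a minimal NumPy illustration under our own assumptions (function names, min-max normalization, and the additive combination of the two scores are illustrative choices, not the paper's exact formulation):

```python
import numpy as np

def free_energy(logits, T=1.0):
    # Free energy of a sample: F(x) = -T * log sum_k exp(logit_k / T).
    # Samples far from the source distribution tend to have higher free energy.
    return -T * np.log(np.sum(np.exp(logits / T), axis=-1))

def predictive_entropy(logits):
    # Uncertainty as the entropy of the softmax distribution over classes;
    # boundary samples have near-uniform probabilities and thus high entropy.
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    probs = np.exp(shifted)
    probs /= probs.sum(axis=-1, keepdims=True)
    return -np.sum(probs * np.log(probs + 1e-12), axis=-1)

def select_for_labeling(logits, budget):
    # Normalize both signals to [0, 1] and rank by their sum; the top-`budget`
    # samples are the candidates sent to a human annotator.
    fe = free_energy(logits)
    ent = predictive_entropy(logits)
    norm = lambda s: (s - s.min()) / (s.max() - s.min() + 1e-12)
    score = norm(fe) + norm(ent)
    return np.argsort(-score)[:budget]

# Example: score 10 target-system logs (2-class logits) and pick 3 to label.
rng = np.random.default_rng(0)
target_logits = rng.normal(size=(10, 2))
to_label = select_for_labeling(target_logits, budget=3)
```

With a 2% labeling budget as in the paper's evaluation, `budget` would be set to 2% of the unlabeled pool per selection round; the selected logs are labeled and used to fine-tune the base model trained on the mature source system.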