FlexFringe: Modeling Software Behavior by Learning Probabilistic Automata

📅 2022-03-28

🏛️ arXiv.org

📈 Citations: 7

✨ Influential: 1

🤖 AI Summary

This work addresses log-based anomaly detection in software systems. As an alternative to neural network models, which are hard to interpret, it models software behavior with probabilistic deterministic finite automata (PDFA) learned by state merging. FlexFringe implements well-known state-merging strategies with several modifications that improve their performance in practice, and the experiments report competitive results and significant improvements over a default implementation. The approach allows a trade-off between interpretability and size: it can learn interpretable state-transition models from software logs, or smaller, more convoluted models that are less interpretable but perform better on anomaly detection, where they outperform an existing neural-network-based solution.

📝 Abstract

We present the efficient implementations of probabilistic deterministic finite automaton learning methods available in FlexFringe. These implement well-known strategies for state-merging, including several modifications to improve their performance in practice. We show experimentally that these algorithms obtain competitive results and significant improvements over a default implementation. We also demonstrate how to use FlexFringe to learn interpretable models from software logs and use these for anomaly detection. Although less interpretable, we show that learning smaller, more convoluted models improves the performance of FlexFringe on anomaly detection, outperforming an existing solution based on neural nets.

Problem

Research questions and friction points this paper is trying to address.

Learning probabilistic automata from software logs

Improving state-merging strategies for better performance

Using learned models for software anomaly detection
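
The anomaly-detection use case listed above can be illustrated with a toy model. The sketch below is not FlexFringe and does not use its API: it builds an unmerged prefix tree with transition counts (the usual starting point for state merging) and scores a trace by its average negative log-likelihood. The class name `PrefixTreePDFA`, the `<end>` marker, the additive smoothing, and the example traces are all assumptions made for illustration.

```python
import math
from collections import defaultdict

END = "<end>"  # hypothetical end-of-trace marker, not taken from the paper


class PrefixTreePDFA:
    """Unmerged prefix tree with transition counts (toy stand-in for a learned PDFA)."""

    def __init__(self, alphabet, smoothing=1.0):
        self.symbols = sorted(set(alphabet)) + [END]
        self.smoothing = smoothing            # additive smoothing (an assumption)
        self.next_state = {}                  # (state, symbol) -> state
        self.counts = defaultdict(lambda: defaultdict(int))  # state -> symbol -> count
        self.n_states = 1                     # state 0 is the root

    def fit(self, traces):
        for trace in traces:
            state = 0
            for sym in trace:
                self.counts[state][sym] += 1
                key = (state, sym)
                if key not in self.next_state:
                    self.next_state[key] = self.n_states
                    self.n_states += 1
                state = self.next_state[key]
            self.counts[state][END] += 1
        return self

    def _prob(self, state, sym):
        # Smoothed estimate; unknown states (-1) fall back to a uniform distribution.
        c = self.counts.get(state, {})
        total = sum(c.values())
        return (c.get(sym, 0) + self.smoothing) / (
            total + self.smoothing * len(self.symbols)
        )

    def score(self, trace):
        """Average negative log-likelihood per symbol; higher means more unusual."""
        state, nll = 0, 0.0
        for sym in list(trace) + [END]:
            nll -= math.log(self._prob(state, sym))
            if sym != END:
                state = self.next_state.get((state, sym), -1)
        return nll / (len(trace) + 1)


# Made-up training traces standing in for parsed log events.
train = [["open", "read", "close"]] * 20 + [["open", "write", "close"]] * 10
model = PrefixTreePDFA({"open", "read", "write", "close"}).fit(train)
normal = model.score(["open", "read", "close"])
odd = model.score(["close", "close", "open"])
```

In this toy setup `odd` comes out larger than `normal`, so a threshold on the score would flag the second trace. A real learner would first merge states of the prefix tree to generalize beyond the exact training traces.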

Innovation

Methods, ideas, or system contributions that make the work stand out.

Implements efficient probabilistic automaton learning methods

Extends well-known state-merging strategies with modifications that improve their performance in practice

Learns interpretable models from software logs
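
To make the statistical side of state merging concrete, here is a minimal sketch of one classical compatibility test from the state-merging literature: the Hoeffding-bound check used by the ALERGIA algorithm. Which tests FlexFringe applies, and with what modifications, is not stated in this summary, so the function names, the `alpha` default, and the example counts are illustrative assumptions. The full ALERGIA test also compares successor states recursively; this sketch checks only the local outgoing-symbol frequencies.

```python
import math


def hoeffding_different(f1, n1, f2, n2, alpha=0.05):
    """True if frequencies f1/n1 and f2/n2 differ by more than the Hoeffding bound."""
    if n1 == 0 or n2 == 0:
        return False  # no observations, hence no evidence against merging
    bound = math.sqrt(0.5 * math.log(2.0 / alpha)) * (
        1.0 / math.sqrt(n1) + 1.0 / math.sqrt(n2)
    )
    return abs(f1 / n1 - f2 / n2) > bound


def compatible(counts_a, counts_b, alpha=0.05):
    """Local merge check: every outgoing symbol frequency must pass the test."""
    n_a, n_b = sum(counts_a.values()), sum(counts_b.values())
    for sym in set(counts_a) | set(counts_b):
        if hoeffding_different(
            counts_a.get(sym, 0), n_a, counts_b.get(sym, 0), n_b, alpha
        ):
            return False
    return True


# Made-up symbol counts for three states.
similar = compatible({"read": 90, "write": 10}, {"read": 85, "write": 15})
different = compatible({"read": 90, "write": 10}, {"read": 10, "write": 90})
```

With these counts the first pair passes the check and the second does not, so only the first pair would be a merge candidate. A larger `alpha` tightens the bound and rejects more merges, which tends to give bigger models; a smaller `alpha` permits more merges.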
