🤖 AI Summary
This work addresses a limitation of traditional protocol fingerprinting: it relies on a closed-world assumption, so an unknown implementation is misclassified as a known one with no indication that the set of models is incomplete. The authors introduce an open-world variant of the fingerprinting problem and propose an incremental fingerprinting approach that combines active automata learning with closed-world fingerprinting. Using fingerprinting and conformance checking, the method quickly determines whether a protocol implementation matches any known model; if not, it learns a new model by exploiting the structure of the existing ones. The authors prove the approach correct and show improved asymptotic complexity compared to naive baselines. Experiments on a variety of protocols show a significant reduction in both misclassifications and interactions with the black-box implementations.
📝 Abstract
Network protocol fingerprinting identifies a protocol implementation by analyzing its input-output behavior. Traditionally, fingerprinting operates under a closed-world assumption, in which models of all implementations are assumed to be available. This assumption is unrealistic in practice, and when it does not hold, fingerprinting produces numerous misclassifications without indicating that a model for an implementation is missing. We therefore introduce an open-world variant of the fingerprinting problem, in which not all models are known in advance. We propose an incremental fingerprinting approach that solves this problem by combining active automata learning with closed-world fingerprinting. Our approach uses fingerprinting and conformance checking to quickly determine whether the implementation under consideration matches an available model. If no match is found, it learns a new model by exploiting the structure of the available models. We prove the correctness of our approach and show improvements in asymptotic complexity over naive baselines. Moreover, experimental results on a variety of protocols demonstrate a significant reduction in misclassifications and in interactions with the black-box implementations.
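The loop described in the abstract (fingerprint, check conformance, learn only on a miss) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that models are small deterministic Mealy machines, that separating words are found by brute-force bounded search, and that conformance is checked exhaustively up to a fixed word length. The names `Mealy`, `separating_word`, `conforms`, `incremental_fingerprint`, and `stub_learn` are hypothetical, and the learning step is a stub where an active automata learner would be plugged in.

```python
from itertools import product


class Mealy:
    """Deterministic Mealy machine: trans[(state, input)] = (next_state, output)."""

    def __init__(self, name, initial, trans):
        self.name, self.initial, self.trans = name, initial, trans

    def run(self, word):
        """Return the tuple of outputs produced for an input word."""
        state, outputs = self.initial, []
        for symbol in word:
            state, out = self.trans[(state, symbol)]
            outputs.append(out)
        return tuple(outputs)


def words(alphabet, max_len):
    """Enumerate all input words of length 1..max_len."""
    for n in range(1, max_len + 1):
        yield from product(alphabet, repeat=n)


def separating_word(m1, m2, alphabet, max_len=4):
    """Shortest word (up to max_len) on which two models disagree, else None."""
    for word in words(alphabet, max_len):
        if m1.run(word) != m2.run(word):
            return word
    return None


def conforms(model, query, alphabet, max_len=4):
    """Bounded conformance check (an assumption of this sketch, not a proof of equivalence)."""
    return all(model.run(w) == query(w) for w in words(alphabet, max_len))


def incremental_fingerprint(query, models, alphabet, learn):
    """Return (model, is_new) for the black box reachable through `query`."""
    candidates = list(models)
    # Closed-world fingerprinting: discard models inconsistent with observations.
    while len(candidates) > 1:
        word = separating_word(candidates[0], candidates[1], alphabet)
        if word is None:          # indistinguishable within the bound: drop one
            candidates.pop(1)
            continue
        observed = query(word)
        candidates = [m for m in candidates if m.run(word) == observed]
    # Conformance check guards against silently misclassifying an unknown system.
    if candidates and conforms(candidates[0], query, alphabet):
        return candidates[0], False
    # Open-world case: learn a new model (hypothetical callback) and remember it.
    new_model = learn(query, alphabet, models)
    models.append(new_model)
    return new_model, True


# Toy example: A and B are known; C is an implementation with no model yet.
ALPHABET = ("a", "b")
A = Mealy("A", 0, {(0, "a"): (0, "x"), (0, "b"): (0, "y")})
B = Mealy("B", 0, {(0, "a"): (0, "x"), (0, "b"): (0, "z")})
C = Mealy("C", 0, {(0, "a"): (1, "x"), (0, "b"): (0, "y"),
                   (1, "a"): (0, "w"), (1, "b"): (1, "y")})


def stub_learn(query, alphabet, models):
    """Stand-in for an active automata learner; simply returns C here."""
    return C


known = [A, B]
match, is_new = incremental_fingerprint(B.run, known, ALPHABET, stub_learn)
learned, was_new = incremental_fingerprint(C.run, known, ALPHABET, stub_learn)
```

In this toy run, fingerprinting alone would label the black box `C` as `A`, because the single separating word for `A` and `B` cannot tell `C` and `A` apart. The conformance check is what exposes the mismatch and triggers learning.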