APISENSOR: Robust Discovery of Web API from Runtime Traffic Logs

📅 2026-03-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the poor robustness of existing API discovery methods under mixed runtime traffic, for example when multiple applications share a common collection point. Static, white-box approaches depend on source code or formal specifications that are often unavailable and tend to over-approximate API usage, producing high false-positive rates, while dynamic black-box techniques lose accuracy in such complex environments. To overcome these limitations, we propose APISENSOR, an unsupervised black-box framework that reconstructs Web APIs from mixed traffic through traffic denoising and normalization, graph-based structural modeling, and a two-stage clustering strategy. APISENSOR improves both accuracy and stability under mixed traffic and also uncovered inconsistencies in the API documentation of a real application. Evaluations on over 10,000 requests across six web applications report an average Group Accuracy Precision of 95.92% and an F1-score of 94.91%, with the lowest performance variance, outperforming ten baseline methods.

📝 Abstract
Large Language Model (LLM)-based agents increasingly rely on APIs to operate complex web applications, but the rapid evolution of these applications often leads to incomplete or inconsistent API documentation. Existing work falls into two categories: (1) static, white-box approaches based on source code or formal specifications, and (2) dynamic, black-box approaches that infer APIs from runtime traffic. Static approaches rely on internal artifacts, which are typically unavailable for closed-source systems, and often over-approximate API usage, resulting in high false-positive rates. Although dynamic black-box API discovery applies broadly, its robustness degrades in complex environments where shared collection points aggregate traffic from multiple applications. To improve robustness under mixed runtime traffic, we propose APISENSOR, a black-box API discovery framework that reconstructs application APIs in an unsupervised manner. APISENSOR performs structured analysis over complex traffic, combining traffic denoising and normalization with a graph-based two-stage clustering process to recover accurate APIs. We evaluated APISENSOR across six web applications using over 10,000 runtime requests with simulated mixed-traffic noise. Results demonstrate that APISENSOR significantly improves discovery accuracy, achieving an average Group Accuracy Precision of 95.92% and an F1-score of 94.91%, outperforming state-of-the-art methods. Across different applications and noise settings, APISENSOR achieves the lowest performance variance and at most an 8.11-point FGA drop, demonstrating the best robustness in comparison with 10 baselines. Ablation studies confirm that each component is essential. Furthermore, APISENSOR revealed API documentation inconsistencies in a real application, later confirmed by community developers.
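To make the idea of traffic normalization more concrete, the following is a minimal sketch of how logged requests might be collapsed into candidate endpoint templates. It is not APISENSOR's algorithm: the placeholder rules, the function names (`normalize_segment`, `normalize_request`, `group_requests`), and the example host are all assumptions made for illustration, and the grouping is a simple exact-match stand-in for the paper's graph-based two-stage clustering.

```python
import re
from collections import defaultdict
from urllib.parse import urlsplit

# Hypothetical placeholder rules (assumed for illustration, not from the paper).
# Each rule maps a path segment that looks like a variable value to a placeholder.
_ID_PATTERNS = [
    (re.compile(r"^\d+$"), "{int}"),
    (re.compile(
        r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
        r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"), "{uuid}"),
    (re.compile(r"^[0-9a-fA-F]{16,}$"), "{hex}"),
]


def normalize_segment(segment):
    """Replace a segment that looks like an identifier with a placeholder."""
    for pattern, placeholder in _ID_PATTERNS:
        if pattern.match(segment):
            return placeholder
    return segment


def normalize_request(method, url):
    """Build a template such as 'GET /api/users/{int}' from a logged request."""
    path = urlsplit(url).path  # the query string and fragment are dropped
    segments = [normalize_segment(s) for s in path.split("/") if s]
    return f"{method.upper()} /" + "/".join(segments)


def group_requests(requests):
    """Group (method, url) pairs by their normalized template."""
    groups = defaultdict(list)
    for method, url in requests:
        groups[normalize_request(method, url)].append(url)
    return dict(groups)


if __name__ == "__main__":
    sample = [
        ("get", "https://shop.example/api/users/42?x=1"),
        ("GET", "https://shop.example/api/users/7"),
        ("POST", "https://shop.example/api/orders"),
    ]
    for template, urls in group_requests(sample).items():
        print(template, len(urls))
```

Exact-match grouping of this kind breaks down when traffic from several applications is mixed or when variable segments do not match a simple pattern, which is the setting the abstract says APISENSOR's denoising and clustering stages are designed for.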
Problem

Research questions and friction points this paper is trying to address.

API discovery
runtime traffic
black-box analysis
web APIs
traffic logs
Innovation

Methods, ideas, or system contributions that make the work stand out.

API discovery
runtime traffic analysis
black-box testing
graph-based clustering
LLM agent support
Yanjing Yang
State Key Laboratory of Novel Software Technology, Software Institute, Nanjing University, Nanjing 210008, China
Chenxing Zhong
Nanjing University of Science and Technology
Software architecture, Software maintenance and evolution, Artificial Intelligence
Ke Han
Southwest Jiaotong University
transportation, optimization, data science
Zeru Cheng
Jinwei Xu
State Key Laboratory of Novel Software Technology, Software Institute, Nanjing University, Nanjing 210008, China
Xin Zhou
Huazhong University of Science and Technology
Computer Vision, 3D Vision
He Zhang
Warsaw University of Technology
Bohan Liu
State Key Laboratory of Novel Software Technology, Software Institute, Nanjing University, Nanjing 210008, China