🤖 AI Summary
Accurate detection of Android third-party libraries (TPLs) is critical for vulnerability tracking, malware analysis, and supply-chain auditing; however, the real-world effectiveness of existing TPL detection tools remains poorly understood.
Method: We construct the first large-scale, manually curated benchmark dataset—featuring precise version-level annotations for both remote and local dependencies—and systematically evaluate 10 state-of-the-art TPL detectors across R8 obfuscation robustness, version identification accuracy, and scalability. Our empirical study analyzes over 6,000 Android apps using static analysis, similarity matching, and version-aware dependency resolution.
Contribution/Results: We uncover pervasive limitations—including high obfuscation sensitivity, frequent version misidentification, and prohibitive resource overhead—and quantify their detrimental impact on downstream security tasks. Crucially, we establish interpretable, empirically grounded links between TPL characteristics (e.g., packaging style, obfuscation resilience) and detector performance, providing a foundational benchmark and actionable insights for developing robust, fine-grained TPL detection methods.
📝 Abstract
Accurate detection of third-party libraries (TPLs) is fundamental to Android security, supporting vulnerability tracking, malware detection, and supply chain auditing. Despite many proposed tools, their real-world effectiveness remains unclear. We present the first large-scale empirical study of ten state-of-the-art TPL detection techniques across over 6,000 apps, enabled by a new ground truth dataset with precise version-level annotations for both remote and local dependencies. Our evaluation exposes tool fragility to R8-era transformations, weak version discrimination, inaccurate correspondence of candidate libraries, difficulty in generalizing similarity thresholds, and prohibitive runtime/memory overheads at scale. Beyond tool assessment, we further analyze how TPLs shape downstream tasks, including vulnerability analysis, malware detection, secret leakage assessment, and LLM-based evaluation. From this perspective, our study provides concrete insights into how TPL characteristics affect these tasks and informs future improvements in security analysis.