On the Security of Research Artifacts

📅 2026-05-07

🤖 AI Summary

This study addresses the lack of security evaluation for research artifacts shared for reproducibility, which can inadvertently expose exploitable attack surfaces. The authors study 509 artifacts from top-tier security venues, propose a taxonomy for context-aware security assessment, and introduce SAFE (Security-Aware Framework for Artifact Evaluation), a first step toward an autonomous assessment framework. SAFE analyzes the findings reported by static analysis tools, using code semantics, execution context, and practical exploitability to separate real risks from false positives. After filtering, 41.60% of the prevalent findings may pose security concerns under practical usage (this figure refers to findings, not to artifacts). SAFE reaches 84.80% accuracy and an 84.63% F1 score in distinguishing security risks from non-security risks.

📝 Abstract

Research artifacts are widely shared to support reproducibility, and artifact evaluation (AE) has become common at many leading conferences. However, AE mainly checks whether artifacts work as claimed and can be reproduced. It largely overlooks potential security risks. Since these artifacts are publicly released and reused, they may unintentionally create opportunities for misuse and raise concerns about safe and responsible sharing. We study 509 research artifacts from top-tier security venues and find that many contain insecure code patterns that may introduce potential attack vectors. We propose a taxonomy for context-aware security assessment to enable structured analysis of such risks. We perform static analysis and examine the resulting findings, filtering false positives and identifying real security risks. Our analysis shows that 41.60% of the prevalent findings may pose security concerns under practical usage. To support scalable analysis, we introduce SAFE (Security-Aware Framework for Artifact Evaluation), a first step toward an autonomous framework that analyzes tool-reported findings by considering code semantics, execution context, and practical exploitability. SAFE achieves 84.80% accuracy and 84.63% F1-score in distinguishing security and non-security risks. Overall, our results show that security is also important in AE for promoting safe and responsible research sharing. The source code is available at: https://github.com/nanda-rani/SAFE
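To make the idea of context-aware filtering concrete, here is a minimal sketch in Python. It is not SAFE's implementation. The rule set `RISKY_CALLS`, the helper names, and the "constant arguments" heuristic are all assumptions chosen for illustration. The sketch flags call patterns that static analyzers commonly report, then uses one simple piece of context (whether every positional argument is a literal) to separate likely false positives from findings that deserve a closer look.

```python
import ast

# Hypothetical rule set (an assumption, not taken from the paper):
# call names that static analyzers commonly report as insecure.
RISKY_CALLS = {"eval", "exec", "os.system", "pickle.loads"}


def call_name(node):
    """Return a dotted name for a call target, e.g. 'os.system'."""
    func = node.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name):
        return f"{func.value.id}.{func.attr}"
    return ""


def uses_shell_true(node):
    """True if the call passes the literal keyword argument shell=True."""
    return any(
        kw.arg == "shell"
        and isinstance(kw.value, ast.Constant)
        and kw.value.value is True
        for kw in node.keywords
    )


def scan(source):
    """Flag risky calls, then apply a simple context check to each one."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        name = call_name(node)
        flagged = name in RISKY_CALLS or (
            name.startswith("subprocess.") and uses_shell_true(node)
        )
        if flagged:
            # Illustrative context heuristic: a call whose positional
            # arguments are all literals cannot carry attacker input,
            # so it is treated as a likely false positive.
            constant_args = all(isinstance(a, ast.Constant) for a in node.args)
            findings.append(
                {"line": node.lineno, "call": name, "likely_risk": not constant_args}
            )
    return sorted(findings, key=lambda f: f["line"])


SAMPLE = '''
import os, subprocess
os.system("ls")
cmd = input()
subprocess.run(cmd, shell=True)
eval(user_data)
'''

for finding in scan(SAMPLE):
    print(finding)
```

In this toy input, a pattern-only scanner would report all three calls. The context check keeps `subprocess.run(cmd, shell=True)` and `eval(user_data)` as likely risks and marks `os.system("ls")` as a probable false positive. A real assessment would need much richer context, such as data flow and how the artifact is actually executed.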

Problem

Research questions and friction points this paper is trying to address.

research artifacts

security risks

artifact evaluation

reproducibility

responsible sharing

Innovation

Methods, ideas, or system contributions that make the work stand out.

artifact evaluation

security risks

static analysis