🤖 AI Summary
Problem: Outsourced code analysis creates a tension between intellectual property protection and accurate vulnerability detection: the analyst needs access to source code that its owner wants to keep private. Existing approaches struggle to provide strong confidentiality and high analytical fidelity at the same time.
Method: This paper introduces “confidential code analysis,” a new paradigm that combines searchable symmetric encryption (SSE) with static code analysis. The source code is processed into an encrypted inverted index that represents its data and control flows, and static analysis tasks are then carried out over that index rather than over the plaintext code.
Contribution/Results: We implement CoCoA, an open-source tool that detects vulnerabilities in encrypted PHP source code while keeping the code confidential. Evaluated on synthetic and real PHP web applications, CoCoA achieves precision similar to that of standard (non-confidential) static analysis tools, with an average performance overhead of 42.7%. To our knowledge, CoCoA is the first systematic solution for secure outsourced code auditing that provides both provable privacy guarantees and practical analytical accuracy.
📝 Abstract
Software vulnerabilities continue to be the main cause of cyber attacks. In an attempt to reduce them and improve software quality, software code analysis has emerged as a service offered by companies specialising in software testing. However, this service requires software companies to provide access to their software's code, which raises concerns about code privacy and intellectual property theft. This paper presents a novel approach to Software Quality and Privacy, in which testing companies can perform code analysis tasks on encrypted software code provided by software companies while code privacy is preserved. The approach combines Static Code Analysis and Searchable Symmetric Encryption in order to process the source code and build an encrypted inverted index that represents its data and control flows. The index is then used to discover vulnerabilities by carrying out static analysis tasks in a confidential way. With this approach, this paper also defines a new research field, Confidential Code Analysis, from which other types of code analysis tasks and approaches can be derived. We implemented the approach in a new tool called CoCoA and evaluated it experimentally with synthetic and real PHP web applications. The results show that the tool achieves precision similar to that of standard (non-confidential) static analysis tools, with a modest average performance overhead of 42.7%.
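To make the idea of an encrypted inverted index more concrete, the following is a minimal sketch of a generic SSE-style lookup, not CoCoA's actual construction. Everything in it is an assumption made for illustration: the function names (`build_index`, `trapdoor`, `search`), the keyword format (`sink:mysql_query`), and the use of line numbers as postings. The keystream cipher built from HMAC-SHA256 is a toy stand-in for a real cipher and should not be used in production.

```python
import hashlib
import hmac
import secrets


def prf(key: bytes, msg: bytes) -> bytes:
    """Keyed pseudorandom function (HMAC-SHA256)."""
    return hmac.new(key, msg, hashlib.sha256).digest()


def xor_pad(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with an HMAC-based keystream (illustration only)."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += prf(key, nonce + counter.to_bytes(4, "big"))
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def build_index(master_key: bytes, flows: dict) -> dict:
    """Code owner side: map each keyword to an opaque label and encrypted postings.

    `flows` is a hypothetical keyword -> list-of-line-numbers mapping produced
    by some plaintext static analysis step.
    """
    index = {}
    for keyword, lines in flows.items():
        label = prf(master_key, b"label|" + keyword.encode())
        value_key = prf(master_key, b"value|" + keyword.encode())
        nonce = secrets.token_bytes(16)
        payload = ",".join(str(n) for n in lines).encode()
        index[label] = (nonce, xor_pad(value_key, nonce, payload))
    return index


def trapdoor(master_key: bytes, keyword: str) -> tuple:
    """Code owner side: derive a search token for one keyword."""
    return (
        prf(master_key, b"label|" + keyword.encode()),
        prf(master_key, b"value|" + keyword.encode()),
    )


def search(index: dict, token: tuple) -> list:
    """Analyst side: look up a token without knowing the keyword or master key."""
    label, value_key = token
    entry = index.get(label)
    if entry is None:
        return []
    nonce, ciphertext = entry
    plaintext = xor_pad(value_key, nonce, ciphertext).decode()
    return [int(n) for n in plaintext.split(",")] if plaintext else []


if __name__ == "__main__":
    key = secrets.token_bytes(32)
    idx = build_index(key, {"sink:mysql_query": [3, 17], "source:$_GET": [1]})
    print(search(idx, trapdoor(key, "sink:mysql_query")))
```

In this sketch the party holding the index sees only pseudorandom labels and ciphertexts until it receives a token, at which point it learns the postings for that one keyword. How the real tool encodes data and control flows into index entries, and what it leaks per query, is defined by the paper rather than by this example.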