Fuzzing the PHP Interpreter via Dataflow Fusion

📅 2024-10-29
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work targets memory errors in the PHP interpreter's large C codebase, which threaten the confidentiality, integrity, and availability of PHP servers. It introduces FlowFusion, described by the authors as the first automatic fuzzing framework for detecting memory errors in the PHP interpreter. Its core idea is dataflow-based fusion: two or more developer-maintained test cases are merged into a single test case with more complex code semantics. Fusion is combined with test mutation, interface fuzzing, and environment crossover. In the evaluation, FlowFusion found 158 previously unknown bugs (125 fixed, 11 confirmed), detected bugs missed by the official test suite and by naive test concatenation, and covered 24% more lines of code than AFL++ and Polyglot after 24 hours of fuzzing. FlowFusion is now integrated into the official PHP toolchain.

📝 Abstract
PHP, a dominant scripting language in web development, powers a vast range of websites, from personal blogs to major platforms. While existing research primarily focuses on PHP application-level security issues like code injection, memory errors within the PHP interpreter have been largely overlooked. These memory errors, prevalent due to the PHP interpreter's extensive C codebase, pose significant risks to the confidentiality, integrity, and availability of PHP servers. This paper introduces FlowFusion, the first automatic fuzzing framework to detect memory errors in the PHP interpreter. FlowFusion leverages dataflow as an efficient representation of test cases maintained by PHP developers, merging two or more test cases to produce fused test cases with more complex code semantics. Moreover, FlowFusion employs strategies such as test mutation, interface fuzzing, and environment crossover to increase bug finding. In our evaluation, FlowFusion found 158 unknown bugs in the PHP interpreter, with 125 fixed and 11 confirmed. Comparing FlowFusion against the official test suite and a naive test concatenation approach, FlowFusion can detect new bugs that these methods miss, while also achieving greater code coverage. FlowFusion also outperformed state-of-the-art fuzzers AFL++ and Polyglot, covering 24% more lines of code after 24 hours of fuzzing. FlowFusion has gained wide recognition among PHP developers and is now integrated into the official PHP toolchain.
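The abstract does not spell out how fusion works, so the following is only a minimal sketch of the general idea, assuming that "fusing" two test cases means placing them in one script and adding a statement that lets a value from the first test flow into a variable of the second. The helper names (`php_variables`, `rename_variables`, `fuse`) and the regex-based variable detection are hypothetical and are not FlowFusion's actual implementation.

```python
import re

# Naive matcher for PHP variables such as $foo. It does not understand
# strings, comments, or heredocs, which a real tool would have to handle.
VAR_PATTERN = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")


def php_variables(body: str) -> list[str]:
    """Return distinct variable names in order of first appearance."""
    seen: list[str] = []
    for name in VAR_PATTERN.findall(body):
        if name != "this" and name not in seen:
            seen.append(name)
    return seen


def rename_variables(body: str, suffix: str) -> str:
    """Append a suffix to every variable so two tests cannot collide."""
    def repl(match: re.Match) -> str:
        name = match.group(1)
        return "$this" if name == "this" else "$" + name + suffix
    return VAR_PATTERN.sub(repl, body)


def fuse(test_a: str, test_b: str) -> str:
    """Join two PHP test bodies and connect them with one assignment.

    Assumption for this sketch: the last variable seen in test A is the
    source, and the first variable seen in test B is the sink.
    """
    a = rename_variables(test_a, "_a").rstrip()
    b = rename_variables(test_b, "_b").rstrip()
    vars_a, vars_b = php_variables(a), php_variables(b)
    if not vars_a or not vars_b:
        # Nothing to connect: fall back to plain concatenation.
        return a + "\n" + b + "\n"

    source, sink = vars_a[-1], vars_b[0]
    bridge = f"${sink} = ${source};"

    # Insert the bridge right after the first line of B that mentions the
    # sink, so B's later statements operate on a value produced by A.
    fused_b: list[str] = []
    bridged = False
    for line in b.splitlines():
        fused_b.append(line)
        if not bridged and "$" + sink in line:
            fused_b.append(bridge)
            bridged = True
    return a + "\n" + "\n".join(fused_b) + "\n"


if __name__ == "__main__":
    test_a = '$s = str_repeat("ab", 3);\nvar_dump(strlen($s));'
    test_b = '$x = [3, 1, 2];\nsort($x);\nvar_dump($x);'
    print(fuse(test_a, test_b))
```

In this toy example the string built by the first test reaches `sort()` in the second, a combination neither original test exercises. Plain concatenation, by contrast, would run the same two code paths back to back with no interaction between them.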
Problem

Research questions and friction points this paper is trying to address.

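Memory errors in the PHP interpreter's C codebase have been largely overlooked
The official test suite and naive test concatenation miss bugs that need more complex code semantics
Existing fuzzers such as AFL++ and Polyglot cover less of the interpreter's code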
Innovation

Methods, ideas, or system contributions that make the work stand out.

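Dataflow-based fusion of two or more existing test cases
Test mutation, interface fuzzing, and environment crossover
24% more lines of code covered than AFL++ and Polyglot after 24 hours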