CodeFlow: Program Behavior Prediction with Dynamic Dependencies Learning

📅 2024-08-05

📈 Citations: 1

✨ Influential: 0

career value

145K/year

🤖 AI Summary

Existing program behavior prediction models struggle to effectively capture dynamic inter-statement dependencies, limiting their performance in code coverage prediction and runtime error detection. To address this, we propose a dual-path joint modeling framework: (1) a static path that encodes control dependencies via control flow graphs (CFGs) and graph neural networks; and (2) a dynamic path that learns temporal execution dependencies from program execution traces. Crucially, we introduce node-level dual-path embedding to enable fine-grained, unified representation of both static and dynamic dependencies. To the best of our knowledge, this is the first work to jointly and coherently model static control-flow dependencies and dynamic execution-time dependencies within a single framework. Evaluated on code coverage prediction and runtime error localization tasks, our approach achieves significant improvements over state-of-the-art methods—yielding a 12.3% gain in prediction accuracy and an 18.7% increase in error localization precision.

Technology Category

Application Category

📝 Abstract

Predicting program behavior without execution is a critical task in software engineering. Existing models often fall short in capturing the dynamic dependencies among program elements. To address this, we present CodeFlow, a novel machine learning-based approach that predicts code coverage and detects runtime errors by learning both static and dynamic dependencies within the code. By using control flow graphs (CFGs), CodeFlow effectively represents all possible execution paths and the statistic relations between different statements, providing a more comprehensive understanding of program behaviors. CodeFlow constructs CFGs to represent possible execution paths and learns vector representations (embeddings) for CFG nodes, capturing static control-flow dependencies. Additionally, it learns dynamic dependencies by leveraging execution traces, which reflect the impacts among statements during execution. This combination enables CodeFlow to accurately predict code coverage and identify runtime errors. Our empirical evaluation demonstrates that CodeFlow significantly improves code coverage prediction accuracy and effectively localizes runtime errors, outperforming state-of-the-art models.

Problem

Research questions and friction points this paper is trying to address.

Predict program behavior without execution

Capture dynamic dependencies in code

Improve code coverage and error detection

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic dependencies learning

Control flow graphs usage

Vector representations embedding

🔎 Similar Papers

Do Large Code Models Understand Programming Concepts? Counterfactual Analysis for Code Predicates