PrETi: Predicting Execution Time in Early Stage with LLVM and Machine Learning

📅 2025-03-17

🤖 AI Summary

To predict program execution time early in software development, when running the full binary is infeasible, the paper proposes PrETi, a lightweight timing prediction method that does not require executing the final binary. The approach builds an LLVM-based, hardware-aware simulation environment at the IR level to extract fine-grained, timing-sensitive features, including instruction counts, cache accesses, and branch misprediction estimates, and combines them with historical performance data to train supervised learning models (e.g., XGBoost and Random Forest). The work is presented as the first to couple LLVM IR-level execution semantics with hardware behavior modeling (cache and branch prediction), addressing limitations of prior methods that rely solely on source-code static analysis or post-execution profiling. Evaluated on public benchmarks, the method achieves a mean absolute percentage error (MAPE) of 11.98%, outperforming state-of-the-art approaches; prediction reportedly completes within minutes, with negligible overhead.

📝 Abstract

We introduce PrETi, a novel framework for predicting software execution time during the early stages of development. PrETi leverages an LLVM-based simulation environment to extract timing-related runtime information, such as the count of executed LLVM IR instructions. This information, combined with historical execution time data, is used to train machine learning models for accurate time prediction. To further enhance prediction accuracy, our approach incorporates simulations of cache accesses and branch prediction. Evaluations on public benchmarks demonstrate that PrETi achieves an average Absolute Percentage Error (APE) of 11.98%, surpassing state-of-the-art methods. These results underscore the effectiveness and efficiency of PrETi as a robust solution for early-stage timing analysis.
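The reported metric can be made concrete with a short sketch. The Python below computes the absolute percentage error for each program and averages it, assuming the usual definition |actual - predicted| / actual; the function names and the sample timings are hypothetical and are not taken from the paper.

```python
def absolute_percentage_error(actual, predicted):
    """APE in percent for one program: |actual - predicted| / actual * 100."""
    if actual <= 0:
        raise ValueError("actual execution time must be positive")
    return abs(actual - predicted) / actual * 100.0


def mean_ape(actuals, predictions):
    """Average APE over a set of programs (assumed definition of the metric)."""
    if len(actuals) != len(predictions) or not actuals:
        raise ValueError("need two non-empty sequences of equal length")
    errors = [absolute_percentage_error(a, p) for a, p in zip(actuals, predictions)]
    return sum(errors) / len(errors)


# Hypothetical measured vs. predicted execution times in milliseconds.
measured = [100.0, 200.0, 50.0]
predicted = [90.0, 220.0, 55.0]
print(f"mean APE: {mean_ape(measured, predicted):.2f}%")
```

Because each error is divided by the measured time, short-running and long-running programs contribute equally to the average.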

Problem

Research questions and friction points this paper is trying to address.

Predict software execution time early in development.

Use LLVM and machine learning for accurate predictions.

Incorporate cache and branch prediction simulations for precision.
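To illustrate what simulating cache accesses and branch prediction can look like, here is a minimal Python sketch of a direct-mapped cache and a 2-bit saturating-counter predictor. Both are textbook models chosen for brevity; the page does not say which cache organization or predictor PrETi simulates, so the class names, sizes, and traces are assumptions.

```python
class DirectMappedCache:
    """Toy direct-mapped cache that only counts hits and misses."""

    def __init__(self, num_lines=64, line_size=64):
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags = [None] * num_lines  # one stored tag per cache line
        self.hits = 0
        self.misses = 0

    def access(self, address):
        block = address // self.line_size
        index = block % self.num_lines
        tag = block // self.num_lines
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag  # miss: replace whatever was in this line
        self.misses += 1
        return False


class TwoBitPredictor:
    """2-bit saturating counter per branch: states 0-1 predict not taken, 2-3 taken."""

    def __init__(self):
        self.counters = {}
        self.mispredictions = 0

    def record(self, branch_id, taken):
        state = self.counters.get(branch_id, 1)  # start at weakly not taken
        predicted_taken = state >= 2
        if predicted_taken != taken:
            self.mispredictions += 1
        state = min(state + 1, 3) if taken else max(state - 1, 0)
        self.counters[branch_id] = state
        return predicted_taken == taken


# Hypothetical memory trace (byte addresses) and loop-branch outcomes.
cache = DirectMappedCache(num_lines=4, line_size=16)
for address in [0, 4, 16, 64, 0]:
    cache.access(address)

predictor = TwoBitPredictor()
for taken in [True, True, True, True, False]:
    predictor.record("loop_back_edge", taken)

print(cache.hits, cache.misses, predictor.mispredictions)
```

The hit, miss, and misprediction counters produced this way are the kind of timing-sensitive features that could be fed to a prediction model alongside instruction counts.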

Innovation

Methods, ideas, or system contributions that make the work stand out.

LLVM-based simulation extracts timing-related runtime data.

Machine learning models predict execution time.

Cache and branch prediction simulations enhance accuracy.
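The step from extracted features to a time estimate can be sketched in a few lines. The example below fits a one-feature ordinary least squares line from executed IR instruction count to measured time. This is a deliberately simple stand-in, not the models the paper trains, and the function names and historical data points are hypothetical.

```python
def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical history: executed IR instruction counts and measured times (microseconds).
ir_counts = [1_000, 2_000, 3_000, 4_000]
times_us = [12.0, 22.0, 32.0, 42.0]

slope, intercept = fit_line(ir_counts, times_us)


def predict_time_us(ir_count):
    """Estimate execution time for a new program from its simulated IR count."""
    return slope * ir_count + intercept


print(f"predicted time for 5000 IR instructions: {predict_time_us(5_000):.1f} us")
```

A tree-based regressor over several features (instruction counts, cache misses, branch mispredictions) would follow the same fit-then-predict pattern with a richer feature vector.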
