Large Language Model powered Symbolic Execution

📅 2025-04-02
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
🤖 AI Summary
Existing large language models (LLMs) face significant challenges in direct symbolic execution—including low precision, high computational overhead, and strong dependence on large-scale models and high-end hardware—hindering practical deployment. To address these limitations, we propose a novel LLM-driven lightweight symbolic execution paradigm. Our approach employs path-guided task decomposition to decouple complex program analysis into fine-grained, resource-efficient subtasks. We introduce the first path-constraint generalization method based on universal code representations—rather than restricted formal languages—enabling language-agnostic constraint modeling. We further implement AutoExe, a lightweight LLM-native symbolic execution engine. Experimental results demonstrate that our method substantially improves both analysis accuracy and path-exploration scalability for small-scale LLMs running on consumer-grade hardware, matching the performance of traditional symbolic execution tools. To the best of our knowledge, this is the first work achieving highly accessible and broadly generalizable LLM-native symbolic execution.

📝 Abstract
Large Language Models (LLMs) have emerged as a promising alternative to traditional static program analysis methods, such as symbolic execution, offering the ability to reason over code directly without relying on theorem provers or SMT solvers. However, LLMs are inherently probabilistic, and therefore face significant challenges regarding the accuracy and scale of analysis in real-world applications. Such issues often necessitate the use of larger LLMs with higher token limits, but this requires enterprise-grade hardware (GPUs) and thus limits accessibility for many users. In this paper, we propose LLM-based symbolic execution -- a novel approach that enhances LLM inference via a path-based decomposition of the program analysis task into smaller (more tractable) sub-tasks. The core idea is to generalize path constraints using a generic code-based representation that the LLM can directly reason over, without translation into another (less-expressive) formal language. We implement our approach in the form of AutoExe, an LLM-based symbolic execution engine that is lightweight and language-agnostic, making it a practical tool for analyzing code that is challenging for traditional approaches. We show that AutoExe can improve both the accuracy and scale of LLM-based program analysis, especially for smaller LLMs that can run on consumer-grade hardware.
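To make the path-based decomposition concrete, the following is a minimal sketch (not taken from the AutoExe implementation; all names are illustrative) of how a program with a handful of branches can be split into per-path sub-tasks, each carrying its own path constraint expressed as ordinary source text that an LLM could analyze in isolation:

```python
# Hypothetical sketch of path-based decomposition: a program with n branch
# conditions yields up to 2^n paths. Each path becomes an independent,
# small sub-task whose constraint is kept as plain code text rather than
# a formula in a formal constraint language.

def enumerate_paths(branches):
    """Enumerate all branch-outcome combinations (paths) for a list of
    branch conditions given as source-code strings."""
    paths = [[]]
    for cond in branches:
        paths = [p + [(cond, taken)] for p in paths for taken in (True, False)]
    return paths

def path_constraint_text(path):
    """Render one path as a code-level constraint string, e.g.
    '(x > 0) and not (y % 2 == 0)'."""
    return " and ".join(f"({c})" if taken else f"not ({c})" for c, taken in path)

if __name__ == "__main__":
    branches = ["x > 0", "y % 2 == 0"]
    for path in enumerate_paths(branches):
        print(path_constraint_text(path))
```

Each printed constraint would be handed to the model as its own lightweight sub-task, which is what keeps the per-query context small enough for modest LLMs.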
Problem

Research questions and friction points this paper is trying to address.

Enhancing LLM-based program analysis accuracy and scale
Reducing hardware requirements for symbolic execution tasks
Generalizing path constraints without formal language translation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Path-based decomposition for tractable subtasks
Generic code representation without formal translation
Lightweight language-agnostic symbolic execution engine