Graph Construction and Matching for Imperative Programs using Neural and Structural Methods

📅 2026-04-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of identifying structural and semantic similarities across imperative programs written in different languages by proposing a unified graph representation that integrates abstract syntax trees with neural semantic embeddings. The approach transforms annotated programs into typed, attributed graphs and uses CodeBERT and SentenceTransformer to generate semantic embeddings. By constructing consistent graph representations across multilingual verification datasets (C/ACSL, Java/JML, and Dafny), it jointly models syntactic structure and formal specifications. The results indicate that such graphs can be built consistently across different languages and annotation styles, which offers a practical basis for cross-language reuse of verification artifacts.

📝 Abstract

Reusing verification artefacts requires identifying structural and semantic similarities across programs and their specifications. In this paper, we focus on graph construction as a foundational step toward this goal. We present a pipeline that converts imperative programs and their annotations into typed, attributed graphs. Our experiments cover datasets including C with ACSL, Java with JML, and Dafny for C#. The pipeline integrates abstract syntax tree parsing with semantic embeddings derived from models such as SentenceTransformer and CodeBERT. This enables the generation of graph representations that capture both structural relationships and semantic context. Our results show that consistent graph representations can be constructed across different languages and annotation styles. This work provides a practical basis for future steps in semantic enrichment and approximate graph matching for scalable verification artefact reuse.
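
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It makes two loud assumptions: Python's standard `ast` module stands in for the C, Java, and Dafny parsers, and a hypothetical `toy_embedding` function (a deterministic hash, not a semantic model) stands in for SentenceTransformer or CodeBERT. The `Node`, `Graph`, and `build_graph` names, and the annotation dictionary format, are likewise invented for illustration.

```python
import ast
import hashlib
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    node_type: str                               # e.g. "FunctionDef", "Annotation"
    attrs: dict = field(default_factory=dict)    # free-form attributes


@dataclass
class Graph:
    nodes: list = field(default_factory=list)
    edges: list = field(default_factory=list)    # (source_id, target_id, edge_type)

    def add_node(self, node_type, **attrs):
        node = Node(len(self.nodes), node_type, attrs)
        self.nodes.append(node)
        return node.node_id

    def add_edge(self, src, dst, edge_type):
        self.edges.append((src, dst, edge_type))


def toy_embedding(text, dim=8):
    """Hypothetical placeholder for a learned encoder.

    Returns a deterministic hash-based vector; it carries no semantics.
    """
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:dim]]


def build_graph(source, annotations, embed=toy_embedding):
    """Build a typed, attributed graph from source code plus annotations.

    `annotations` maps a function name to a list of specification strings
    (an assumed format, chosen only to keep the sketch short).
    """
    graph = Graph()

    def visit(tree_node, parent_id):
        # One graph node per AST node, typed by the AST class name.
        node_id = graph.add_node(
            type(tree_node).__name__, name=getattr(tree_node, "name", None)
        )
        if parent_id is not None:
            graph.add_edge(parent_id, node_id, "child")
        # Attach each specification as its own node carrying an embedding.
        if isinstance(tree_node, ast.FunctionDef):
            for text in annotations.get(tree_node.name, []):
                ann_id = graph.add_node("Annotation", text=text, embedding=embed(text))
                graph.add_edge(node_id, ann_id, "annotated_by")
        for child in ast.iter_child_nodes(tree_node):
            visit(child, node_id)

    visit(ast.parse(source), None)
    return graph


SOURCE = "def abs_val(x):\n    if x < 0:\n        return -x\n    return x\n"
ANNOTATIONS = {"abs_val": ["ensures result >= 0"]}

graph = build_graph(SOURCE, ANNOTATIONS)
annotation_nodes = [n for n in graph.nodes if n.node_type == "Annotation"]
```

Keeping structural edges (`child`) and specification edges (`annotated_by`) as separate edge types is one way to let a later matching step weigh syntax and semantics differently. Swapping `toy_embedding` for a real encoder would only require passing a different `embed` callable.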

Problem

Research questions and friction points this paper is trying to address.

graph construction

verification artefact reuse

semantic similarity

imperative programs

program representation

Innovation

Methods, ideas, or system contributions that make the work stand out.

graph construction

semantic embedding

abstract syntax tree