Call Me Maybe: Enhancing JavaScript Call Graph Construction using Graph Neural Networks

📅 2025-06-22



🤖 AI Summary

JavaScript's dynamic features make existing call graph construction methods neither sound nor complete: they introduce spurious edges and miss genuine calls. To address this, we propose the first GNN-based link prediction framework for multi-file, whole-program graphs, formulating the resolution of unresolved call sites as link prediction over program graphs with multiple edge types. Our approach combines syntactic and semantic edges to build a richer program representation, uses both static and dynamic call edges as imperfect (weak) supervision, and can learn from labels drawn from the same project or from different projects. Evaluated on 50 widely used JavaScript libraries, our method achieves 42.3% top-1 accuracy in identifying the correct target function for unresolved call sites, with 72.1% of ground-truth targets ranked within the top five candidates. By recovering missed call edges, this improves call graph recall and reduces the manual effort needed for analysis.


📝 Abstract

Static analysis plays a key role in finding bugs, including security issues. A critical step in static analysis is building accurate call graphs that model function calls in a program. However, due to hard-to-analyze language features, existing call graph construction algorithms for JavaScript are neither sound nor complete. Prior work shows that even advanced solutions produce false edges and miss valid ones. In this work, we assist these tools by identifying missed call edges. Our main idea is to frame the problem as link prediction on full program graphs, using a rich representation with multiple edge types. Our approach, GRAPHIA, leverages recent advances in graph neural networks to model non-local relationships between code elements. Concretely, we propose representing JavaScript programs using a combination of syntactic- and semantic-based edges. GRAPHIA can learn from imperfect labels, including static call edges from existing tools and dynamic edges from tests, either from the same or different projects. Because call graphs are sparse, standard machine learning metrics like ROC are not suitable. Instead, we evaluate GRAPHIA by ranking function definitions for each unresolved call site. We conduct a large-scale evaluation on 50 popular JavaScript libraries with 163K call edges (150K static and 13K dynamic). GRAPHIA builds program graphs with 6.6M structural and 386K semantic edges. It ranks the correct target as the top candidate in over 42% of unresolved cases and within the top 5 in 72% of cases, reducing the manual effort needed for analysis. Our results show that learning-based methods can improve the recall of JavaScript call graph construction. To our knowledge, this is the first work to apply GNN-based link prediction to full multi-file program graphs for interprocedural analysis.
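The ranking-based evaluation described in the abstract can be illustrated with a minimal sketch. It assumes each unresolved call site comes with a score per candidate function definition and a set of known true targets; the names `rank_candidates` and `top_k_accuracy`, the call-site ids, and the scores are all hypothetical and are not taken from GRAPHIA or from the paper's data.

```python
from typing import Dict, List, Set


def rank_candidates(scores: Dict[str, float]) -> List[str]:
    """Order candidate function ids by descending score (ties broken by id)."""
    return sorted(scores, key=lambda fn: (-scores[fn], fn))


def top_k_accuracy(
    rankings: Dict[str, List[str]],
    truth: Dict[str, Set[str]],
    k: int,
) -> float:
    """Fraction of call sites with at least one true target in the first k ranks."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # Only call sites with a known, non-empty set of true targets are evaluated.
    evaluated = [site for site in rankings if truth.get(site)]
    if not evaluated:
        return 0.0
    hits = sum(1 for site in evaluated if truth[site] & set(rankings[site][:k]))
    return hits / len(evaluated)


# Hypothetical scores for two unresolved call sites (illustrative only).
scores = {
    "a.js:12": {"parse": 0.9, "render": 0.4, "emit": 0.1},
    "b.js:30": {"parse": 0.2, "render": 0.5, "emit": 0.7},
}
truth = {"a.js:12": {"parse"}, "b.js:30": {"render"}}
rankings = {site: rank_candidates(s) for site, s in scores.items()}

print(top_k_accuracy(rankings, truth, k=1))  # 0.5
print(top_k_accuracy(rankings, truth, k=2))  # 1.0
```

Scoring per call site sidesteps the sparsity problem the abstract mentions: each call site contributes one hit or miss, regardless of how many non-edges surround it in the graph.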

Problem

Research questions and friction points this paper is trying to address.

Improving JavaScript call graph accuracy using GNNs

Addressing false and missing edges in static analysis

Enhancing interprocedural analysis with multi-file program graphs

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Graph Neural Networks for call edge prediction

Combines syntactic and semantic edges in program graphs

Learns from both static and dynamic call edges
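As a rough illustration of the points above, a multi-file program graph with several edge types can be sketched as a small typed-edge structure from which link prediction candidates are enumerated. This is only a sketch under stated assumptions: the class `ProgramGraph`, the node kinds, and the edge type names (`ast_child`, `def_use`, `calls`) are hypothetical and do not reflect GRAPHIA's actual schema, and no neural model is included.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class ProgramGraph:
    """Program graph with typed edges (hypothetical schema)."""

    node_kind: Dict[str, str] = field(default_factory=dict)
    edges: Dict[str, Set[Tuple[str, str]]] = field(
        default_factory=lambda: defaultdict(set)
    )

    def add_node(self, node_id: str, kind: str) -> None:
        self.node_kind[node_id] = kind

    def add_edge(self, edge_type: str, src: str, dst: str) -> None:
        if src not in self.node_kind or dst not in self.node_kind:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges[edge_type].add((src, dst))

    def nodes_of_kind(self, kind: str) -> List[str]:
        return sorted(n for n, k in self.node_kind.items() if k == kind)

    def unresolved_call_sites(self) -> List[str]:
        """Call sites without any known outgoing 'calls' edge."""
        resolved = {src for src, _ in self.edges.get("calls", set())}
        return [s for s in self.nodes_of_kind("call_site") if s not in resolved]

    def candidate_pairs(self) -> List[Tuple[str, str]]:
        """Every (unresolved call site, function definition) pair to be scored."""
        functions = self.nodes_of_kind("function_def")
        return [(s, f) for s in self.unresolved_call_sites() for f in functions]


# Two hypothetical files; node ids carry the file name to span the whole program.
g = ProgramGraph()
g.add_node("a.js::foo", "function_def")
g.add_node("b.js::bar", "function_def")
g.add_node("a.js::call@3", "call_site")
g.add_node("b.js::call@7", "call_site")
g.add_edge("ast_child", "a.js::foo", "a.js::call@3")    # syntactic edge
g.add_edge("def_use", "b.js::bar", "b.js::call@7")      # semantic edge
g.add_edge("calls", "a.js::call@3", "b.js::bar")        # known call edge (label)

print(g.candidate_pairs())
# [('b.js::call@7', 'a.js::foo'), ('b.js::call@7', 'b.js::bar')]
```

A link prediction model would assign a score to each candidate pair, with known `calls` edges from static tools or test runs serving as training labels.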

