GraphMend: Code Transformations for Fixing Graph Breaks in PyTorch 2

📅 2025-09-17
🤖 AI Summary
Problem: In PyTorch 2, FX graph breaks, caused primarily by dynamic control flow and unsupported Python I/O operations, fragment computational graphs, trigger frequent fallbacks to eager mode, exacerbate CPU-GPU synchronization overhead, and hinder graph-level optimizations. Method: We propose a compiler-frontend, source-to-source transformation technique that automatically analyzes and rewrites Python code before TorchDynamo's graph capture. Built on the Jac framework, the approach combines static analysis with targeted source rewriting to eliminate *repairable* graph breaks, so that the TorchDynamo and TorchInductor pipeline receives larger, uninterrupted graphs. Contribution/Results: Evaluated on eight Hugging Face models, our method achieves zero graph breaks for six models, reduces inference latency by up to 75%, and improves end-to-end throughput by up to 8%. This work is the first to systematically incorporate source-level transformations into the PyTorch JIT compilation pipeline, significantly expanding both the scope of compilable graphs and the applicability of downstream optimizations.

📝 Abstract
This paper presents GraphMend, a high-level compiler that eliminates FX graph breaks in PyTorch 2 programs. Although PyTorch 2 introduced TorchDynamo and TorchInductor to enable just-in-time graph compilation, unresolved dynamic control flow and unsupported Python constructs often fragment models into multiple FX graphs. These fragments force frequent fallbacks to eager mode, incur costly CPU-to-GPU synchronizations, and reduce optimization opportunities. GraphMend addresses this limitation by analyzing and transforming source code before execution. Built on the Jac compilation framework, GraphMend introduces two code transformations that remove graph breaks due to dynamic control flow and Python I/O functions. This design allows PyTorch's compilation pipeline to capture larger, uninterrupted FX graphs without requiring manual refactoring by developers. Evaluation across eight Hugging Face models shows that GraphMend removes all fixable graph breaks due to dynamic control flow and Python I/O functions, driving the break count to 0 in 6 models and reducing it from 5 to 2 in another model. On NVIDIA RTX 3090 and A40 GPUs, GraphMend achieves up to 75% latency reductions and up to 8% higher end-to-end throughput. These results demonstrate that high-level code transformation is an effective complement to PyTorch's dynamic JIT compilation pipeline, substantially improving both usability and performance.

Problem

Research questions and friction points this paper is trying to address.

Eliminating FX graph breaks in PyTorch 2 programs caused by dynamic control flow
Removing graph fragmentation that forces frequent fallbacks to eager execution mode
Addressing unsupported Python constructs that reduce optimization opportunities during compilation
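
The break sources listed above can be spotted statically in simple cases. The following is a minimal sketch, assuming Python's standard `ast` module as a stand-in for the paper's Jac-based analysis; the function name `find_break_candidates`, the `IO_CALLS` set, and the "condition contains a call" heuristic are illustrative assumptions, not GraphMend's actual rules.

```python
import ast

# Assumed, illustrative set of I/O builtins; not an exhaustive or official list.
IO_CALLS = {"print"}


def find_break_candidates(source: str):
    """Return sorted (line, kind) pairs for constructs that may cause graph breaks."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in IO_CALLS):
            found.append((node.lineno, "io:" + node.func.id))
        elif isinstance(node, ast.If) and any(
                isinstance(n, ast.Call) for n in ast.walk(node.test)):
            # Heuristic only: a condition that calls something (e.g. x.sum())
            # may depend on runtime tensor values.
            found.append((node.lineno, "control-flow"))
    return sorted(found)


SOURCE = """\
def forward(x):
    print("running forward")
    if x.sum() > 0:
        return x * 2
    return x - 1
"""

CANDIDATES = find_break_candidates(SOURCE)
for line, kind in CANDIDATES:
    print(f"line {line}: {kind}")
```

A purely syntactic check like this over-reports: a branch on a plain Python value would be flagged as well, so a real analysis would need type or data-flow information to confirm that the condition depends on tensor data.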

Innovation

Methods, ideas, or system contributions that make the work stand out.

Analyzing and transforming source code before execution
Removing graph breaks from dynamic control flow
Eliminating graph breaks caused by Python I/O functions
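
To illustrate what a source-level rewrite of dynamic control flow can look like, here is a minimal sketch that turns a simple two-branch `if`/`else` assignment into a single `torch.where` call. It assumes Python's standard `ast` module (Python 3.9+ for `ast.unparse`) in place of the Jac framework the paper builds on; the class `IfToWhere` and the helper `mend` are hypothetical names, and the sketch handles only the narrowest case.

```python
import ast


class IfToWhere(ast.NodeTransformer):
    """Rewrite `if c: v = a / else: v = b` into `v = torch.where(c, a, b)`."""

    def visit_If(self, node):
        self.generic_visit(node)
        # Only the simplest shape: one assignment per branch, same target name.
        if not (len(node.body) == 1 and len(node.orelse) == 1):
            return node
        then_stmt, else_stmt = node.body[0], node.orelse[0]
        if not (isinstance(then_stmt, ast.Assign)
                and isinstance(else_stmt, ast.Assign)
                and len(then_stmt.targets) == 1
                and len(else_stmt.targets) == 1
                and isinstance(then_stmt.targets[0], ast.Name)
                and isinstance(else_stmt.targets[0], ast.Name)
                and then_stmt.targets[0].id == else_stmt.targets[0].id):
            return node
        call = ast.Call(
            func=ast.Attribute(value=ast.Name(id="torch", ctx=ast.Load()),
                               attr="where", ctx=ast.Load()),
            args=[node.test, then_stmt.value, else_stmt.value],
            keywords=[],
        )
        new_stmt = ast.Assign(targets=[then_stmt.targets[0]], value=call)
        return ast.copy_location(new_stmt, node)


def mend(source: str) -> str:
    """Apply the rewrite to a source string and return the new source."""
    tree = IfToWhere().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


SOURCE = """\
def forward(x):
    if x.sum() > 0:
        y = x * 2
    else:
        y = x - 1
    return y
"""

MENDED = mend(SOURCE)
print(MENDED)
```

The rewrite is only safe under assumptions the sketch does not check: the condition must evaluate to a tensor, both branch expressions must be free of side effects (`torch.where` evaluates both), and their results must be broadcast-compatible.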

Authors

Savini Kashmira, University of Michigan, Ann Arbor, USA
Jayanaka Dantanarayana, University of Michigan, Ann Arbor, USA
Thamirawaran Sathiyalogeswaran, Jaseci Labs, Ann Arbor, USA
Yichao Yuan, University of Michigan, Ann Arbor, USA
Nishil Talati, Assistant Research Scientist, University of Michigan (Computer Architecture, Systems, Generative AI, Data Analytics)
Krisztian Flautner, University of Michigan, Ann Arbor, USA
Lingjia Tang, University of Michigan (Computer system, NLP, AI/ML, Datacenter Efficiency)
Jason Mars, Professor of Computer Science and Engineering, University of Michigan (Computer Architecture, Runtime Systems, Compilers, Emerging Cloud/Mobile Platforms)