🤖 AI Summary
Tensor compilers lack unified, realistic, multi-framework benchmarks of computational graphs. Method: We introduce GraphNet, a large-scale standardized dataset of 2.7K real-world computational graphs spanning six task categories and multiple deep learning frameworks (e.g., PyTorch, TensorFlow). We propose two evaluation metrics, the Speedup Score S(t) and the Error-aware Speedup Score ES(t), presented as the first to jointly quantify execution speedup, numerical correctness, and error sensitivity. We further design a cross-framework graph extraction methodology and an automated evaluation toolchain supporting end-to-end performance assessment of mainstream compilers, including CINN and TorchInductor. Contribution/Results: Empirical validation on CV and NLP tasks demonstrates GraphNet's effectiveness in exposing optimization bottlenecks across diverse graph structures. All code, data, and tools are publicly released.
📝 Abstract
We introduce GraphNet, a dataset of 2.7K real-world deep learning computational graphs with rich metadata, spanning six major task categories across multiple deep learning frameworks. To evaluate tensor compiler performance on these samples, we propose the benchmark metric Speedup Score S(t), which jointly considers runtime speedup and execution correctness under tunable tolerance levels, offering a reliable measure of general optimization capability. Furthermore, we extend S(t) to the Error-aware Speedup Score ES(t), which incorporates error information and helps compiler developers identify key performance bottlenecks. In this report, we benchmark the default tensor compilers, CINN for PaddlePaddle and TorchInductor for PyTorch, on computer vision (CV) and natural language processing (NLP) samples to demonstrate the practicality of GraphNet. The full construction pipeline with graph extraction and compiler evaluation tools is available at https://github.com/PaddlePaddle/GraphNet.
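To make the idea of a tolerance-gated metric concrete, the sketch below shows one plausible shape for a per-sample Speedup Score: a compiled graph earns its measured speedup only if its outputs match the reference (eager) outputs within a tunable tolerance t, and scores zero otherwise. This is a minimal illustration under stated assumptions, not the paper's actual S(t) formula; the function name `speedup_score`, the relative-error check, and the zero-on-failure rule are all assumptions for illustration.

```python
def speedup_score(eager_time, compiled_time, eager_out, compiled_out, t=1e-3):
    """Hypothetical per-sample score: speedup gated by a correctness check.

    Assumption: the sample scores its raw speedup (eager_time / compiled_time)
    only when every output element agrees with the eager reference within
    relative tolerance t; any tolerance violation zeroes the score.
    """
    eps = 1e-12  # guard against division by zero in the relative error
    max_rel_err = max(
        abs(a - b) / (abs(a) + eps)
        for a, b in zip(eager_out, compiled_out)
    )
    if max_rel_err > t:
        return 0.0  # numerically incorrect: no credit for being fast
    return eager_time / compiled_time
```

Tightening t makes the metric stricter about numerical fidelity, while loosening it rewards aggressive optimizations that trade some precision for speed; an error-aware variant like ES(t) would additionally fold the magnitude of `max_rel_err` into the score rather than applying a hard cutoff.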