🤖 AI Summary
Generating realistic directed acyclic graphs (DAGs) for hardware synthesis and compiler optimization remains challenging, particularly when directional constraints and logical dependencies must be modeled jointly. Method: This paper proposes a hierarchical autoregressive diffusion model that encodes the DAG’s partial order as a sequence of layered bipartite graphs, thereby decoupling global dependencies into locally tractable bipartite relations. Directionality is captured via autoregression across layers, while logical constraints within each bipartite graph are modeled through diffusion-based denoising. Contribution/Results: The method achieves 100% DAG validity for graphs with up to 400 nodes and generates graphs whose statistical properties closely match those of real computational flow graphs. In cross-platform performance prediction tasks, it reduces the average error of ML-based surrogate models by 23.6%, addressing expressiveness and generalization bottlenecks in large-scale DAG generation.
📝 Abstract
Directed acyclic graphs (DAGs) serve as crucial data representations in domains such as hardware synthesis and compiler/program optimization for computing systems. DAG generative models facilitate the creation of synthetic DAGs, which can be used for benchmarking computing systems while preserving intellectual property. However, generating realistic DAGs is challenging due to their inherent directional and logical dependencies. This paper introduces LayerDAG, an autoregressive diffusion model, to address these challenges. LayerDAG decouples the strong node dependencies into manageable units that can be processed sequentially. By interpreting the partial order of nodes as a sequence of bipartite graphs, LayerDAG leverages autoregressive generation to model directional dependencies and employs diffusion models to capture logical dependencies within each bipartite graph. Comparative analyses demonstrate that LayerDAG outperforms existing DAG generative models in both expressiveness and generalization, particularly for generating large-scale DAGs with up to 400 nodes, a critical scenario for system benchmarking. Extensive experiments on both synthetic and real-world flow graphs from various computing platforms show that LayerDAG generates valid DAGs with superior statistical properties and benchmarking performance. The synthetic DAGs generated by LayerDAG enhance the training of ML-based surrogate models, resulting in improved accuracy in predicting performance metrics of real-world DAGs across diverse computing platforms.
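The layered view described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes that a node's layer is the length of the longest path reaching it from any source node, which is one natural way to turn a partial order into layers, and the function names `layerize` and `bipartite_steps` are hypothetical. The sketch only shows the decomposition that an autoregressive generator would walk through; the diffusion model that would fill in each step is omitted.

```python
from collections import defaultdict, deque


def layerize(num_nodes, edges):
    """Group nodes 0..num_nodes-1 into layers.

    Assumption (not confirmed by the abstract): a node's layer index is
    the length of the longest path from any source node to it.
    """
    successors = defaultdict(list)
    indegree = [0] * num_nodes
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1

    depth = [0] * num_nodes
    queue = deque(n for n in range(num_nodes) if indegree[n] == 0)
    visited = 0
    # Kahn-style traversal: a node is processed only after all of its
    # predecessors, so depth[u] is final when u is popped.
    while queue:
        u = queue.popleft()
        visited += 1
        for v in successors[u]:
            depth[v] = max(depth[v], depth[u] + 1)
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if visited != num_nodes:
        raise ValueError("input graph contains a cycle")

    layers = defaultdict(list)
    for node, d in enumerate(depth):
        layers[d].append(node)
    return [layers[d] for d in range(len(layers))]


def bipartite_steps(num_nodes, edges):
    """Return the layers plus one (new_nodes, incoming_edges) pair per
    layer after the first. Each pair is a bipartite graph between the
    nodes generated so far and the nodes of the new layer."""
    layers = layerize(num_nodes, edges)
    layer_of = {n: i for i, layer in enumerate(layers) for n in layer}
    steps = []
    for i in range(1, len(layers)):
        incoming = sorted((u, v) for u, v in edges if layer_of[v] == i)
        steps.append((layers[i], incoming))
    return layers, steps


if __name__ == "__main__":
    layers, steps = bipartite_steps(4, [(0, 2), (1, 2), (0, 3), (2, 3)])
    print(layers)
    for new_nodes, incoming in steps:
        print(new_nodes, incoming)
```

Under this layering every edge points from an earlier layer to a later one, so a generator that only adds edges from existing nodes into the newest layer cannot create a cycle. This is one way to see why a layerwise scheme can guarantee valid DAGs by construction.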