🤖 AI Summary
Existing neural architecture search (NAS) performance predictors suffer from poor generalization due to spurious correlations induced by train-test distribution shifts. To address this, we propose a causality-guided architecture representation learning framework that explicitly disentangles causal (critical) from non-causal (redundant) features—marking the first such effort in NAS—by integrating substructure-aware modeling and causal intervention-based sampling. Our method comprises four components: architecture substructure modeling, intervention-driven sample generation, latent-space feature disentanglement, and performance regression modeling. Evaluated across five mainstream NAS search spaces, it achieves state-of-the-art performance: 97.67% top-1 accuracy on CIFAR-10 within the DARTS space. Crucially, it significantly improves cross-distribution generalization and provides enhanced causal interpretability of architectural features.
📝 Abstract
Performance predictors have emerged as a promising method to accelerate the evaluation stage of neural architecture search (NAS). These predictors estimate the performance of unseen architectures by learning the correlation between a small set of trained architectures and their performance. However, most existing predictors ignore the inherent distribution shift between limited training samples and diverse test samples. Hence, they tend to learn spurious correlations as shortcuts to predictions, leading to poor generalization. To address this, we propose a Causality-guided Architecture Representation Learning (CARL) method that aims to separate critical (causal) and redundant (non-causal) features of architectures for generalizable architecture performance prediction. Specifically, we employ a substructure extractor to split the input architecture into critical and redundant substructures in the latent space. Then, we generate multiple interventional samples by pairing critical representations with diverse redundant representations to prioritize critical features. Extensive experiments on five NAS search spaces demonstrate the state-of-the-art accuracy and superior interpretability of CARL. For instance, CARL achieves 97.67% top-1 accuracy on CIFAR-10 in the DARTS search space.
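The intervention step described above (holding an architecture's critical representation fixed while varying its redundant representation) can be sketched in a few lines. This is a minimal illustrative sketch, not the authors' implementation: the function name, the batch-permutation strategy for sampling redundant partners, and the latent dimensions are all assumptions for illustration.

```python
import numpy as np

def interventional_samples(critical, redundant, num_interventions=4, seed=0):
    """Pair each critical (causal) latent with redundant (non-causal)
    latents drawn from other architectures in the batch, so the predictor
    is encouraged to rely on the critical part only.

    critical:  (n, d_c) array of critical substructure representations
    redundant: (n, d_r) array of redundant substructure representations
    Returns:   (num_interventions, n, d_c + d_r) interventional batch.
    """
    rng = np.random.default_rng(seed)
    n = critical.shape[0]
    samples = []
    for _ in range(num_interventions):
        perm = rng.permutation(n)  # reassign redundant parts across the batch
        samples.append(np.concatenate([critical, redundant[perm]], axis=1))
    return np.stack(samples)

# Toy latents for a batch of 8 architectures (dimensions are arbitrary).
crit = np.random.rand(8, 16)
red = np.random.rand(8, 16)
out = interventional_samples(crit, red)
print(out.shape)  # (4, 8, 32)
```

Each interventional batch would then be scored by the performance regressor; because the prediction target stays tied to the critical part, redundant features that vary freely across interventions carry no consistent signal.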