Learning Gaussian DAG Models without Condition Number Bounds

📅 2025-11-08
📈 Citations: 1
Influential: 0
📄 PDF
🤖 AI Summary
This paper addresses the problem of learning the topological structure of directed acyclic graph (DAG) Gaussian graphical models under the equal-variance assumption. Existing methods suffer from sample complexity that grows polynomially with the condition number of the covariance matrix, limiting their applicability in high-dimensional settings. To overcome this bottleneck, the authors propose the first polynomial-time algorithm whose sample complexity is independent of the condition number. Its sample complexity depends only on the maximum in-degree $d$ and $\log n$ (where $n$ is the number of nodes), achieving a nearly tight theoretical bound. The method leverages structural properties of equal-variance DAGs, integrating statistical hypothesis testing with graph-structure recovery techniques. Rigorous upper-bound analysis and an information-theoretic lower-bound construction establish its theoretical guarantees. Experiments on synthetic data confirm the theoretical predictions and demonstrate substantial improvements over baseline methods that are sensitive to the condition number.

📝 Abstract
We study the problem of learning the topology of a directed Gaussian Graphical Model under the equal-variance assumption, where the graph has $n$ nodes and maximum in-degree $d$. Prior work has established that $O(d \log n)$ samples are sufficient for this task. However, an important factor that is often overlooked in these analyses is the dependence on the condition number of the covariance matrix of the model. Indeed, all algorithms from prior work require a number of samples that grows polynomially with this condition number. In many cases this is unsatisfactory, since the condition number could grow polynomially with $n$, rendering these prior approaches impractical in high-dimensional settings. In this work, we provide an algorithm that recovers the underlying graph and prove that the number of samples required is independent of the condition number. Furthermore, we establish lower bounds that nearly match the upper bound up to a $d$-factor, thus providing an almost tight characterization of the true sample complexity of the problem. Moreover, under a further assumption that all the variances of the variables are bounded, we design a polynomial-time algorithm that recovers the underlying graph, at the cost of an additional polynomial dependence of the sample complexity on $d$. We complement our theoretical findings with simulations on synthetic datasets that confirm our predictions.
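To make the setting concrete, the following is a minimal sketch of the classic ordering-recovery idea for equal-variance Gaussian DAGs (in the spirit of the prior work the paper improves on, not the paper's own condition-number-free algorithm): since all noise variances are equal, a node's residual variance given the already-ordered nodes drops to the shared noise level exactly when all of its parents have been ordered. All function names, edge-weight ranges, and sample sizes below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_equal_variance_dag(n, d, rng):
    """Sample a weight matrix B for a DAG on nodes 0..n-1 (edges only from
    lower- to higher-indexed nodes), max in-degree d, all noise variances 1."""
    B = np.zeros((n, n))
    for j in range(1, n):
        parents = rng.choice(j, size=min(d, j), replace=False)
        B[parents, j] = rng.uniform(0.5, 1.5, size=len(parents))
    return B

def sample_data(B, m, rng):
    """Draw m samples from the linear Gaussian SEM X_j = B[:, j]^T X + N(0, 1)."""
    n = B.shape[0]
    X = np.zeros((m, n))
    for j in range(n):  # nodes 0..n-1 are already in topological order
        X[:, j] = X @ B[:, j] + rng.standard_normal(m)
    return X

def recover_order(X):
    """Greedy source-finding: at each step, append the node whose residual
    variance after regressing out the already-ordered nodes is smallest."""
    m, n = X.shape
    order, remaining = [], list(range(n))
    while remaining:
        best, best_var = None, np.inf
        for j in remaining:
            if order:
                Z = X[:, order]
                coef, *_ = np.linalg.lstsq(Z, X[:, j], rcond=None)
                resid = X[:, j] - Z @ coef
            else:
                resid = X[:, j]
            v = resid.var()
            if v < best_var:
                best, best_var = j, v
        order.append(best)
        remaining.remove(best)
    return order

B = random_equal_variance_dag(n=8, d=2, rng=rng)
X = sample_data(B, m=20000, rng=rng)
order = recover_order(X)

# The recovered order is valid iff every edge i -> j places i before j.
pos = {v: i for i, v in enumerate(order)}
ok = all(pos[i] < pos[j] for i, j in zip(*np.nonzero(B)))
```

Note that the accuracy of the residual-variance estimates here is exactly where the covariance matrix's condition number creeps into the sample complexity of such baselines; the paper's contribution is an algorithm whose sample requirement avoids that dependence.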
Problem

Research questions and friction points this paper is trying to address.

Learning Gaussian DAG topology under equal-variance assumption
Eliminating sample complexity dependence on condition number
Providing nearly tight characterization of true sample complexity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Algorithm recovers graph without condition number bounds
Sample complexity independent of covariance condition number
Polynomial-time algorithm for bounded variance settings
Constantinos Daskalakis
Professor of Computer Science, MIT
theoretical computer science, economics, probability theory, learning, statistics
Vardis Kandiros
Data Science Institute, Columbia University
Rui Yao
EECS Department, MIT