🤖 AI Summary
Problem: Existing differentiable structure learning methods rely on strong generative assumptions (e.g., specific structural equation models) and model only linear effects, which limits their ability to capture complex nonlinear dependencies in discrete data and hinders generalizability.
Method: We propose the first differentiable structure learning framework that imposes no strong generative assumptions, enabling flexible modeling of arbitrary higher-order dependencies among binary variables. By reformulating graph structure search as an end-to-end continuous optimization problem, we theoretically establish identifiability up to the Markov equivalence class under mild regularity conditions. Our approach jointly optimizes over parameter space and structural equivalence classes to ensure compatibility with observational data.
Contribution/Results: Experiments on synthetic and real-world discrete datasets demonstrate significant improvements over state-of-the-art methods, achieving accurate recovery of complex nonlinear causal structures.
📝 Abstract
Existing methods for differentiable structure learning in discrete data typically assume that the data are generated from specific structural equation models. However, these assumptions may not align with the true data-generating process, which limits the general applicability of such methods. Furthermore, current approaches often ignore the complex dependence structure inherent in discrete data and consider only linear effects. We propose a differentiable structure learning framework that is capable of capturing arbitrary dependencies among discrete variables. We show that although general discrete models are unidentifiable from purely observational data, it is possible to characterize the complete set of compatible parameters and structures. Additionally, we establish identifiability up to Markov equivalence under mild assumptions. We formulate the learning problem as a single differentiable optimization task in the most general form, thereby avoiding the unrealistic simplifications adopted by previous methods. Empirical results demonstrate that our approach effectively captures complex relationships in discrete data.
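The non-identifiability claim in the abstract can be illustrated with the smallest possible case. The sketch below is not the paper's method: it takes a made-up joint distribution over two binary variables (the numbers in `joint` and the helper names `marginal`, `factorize`, and `reconstruct` are assumptions chosen for illustration) and shows that the same distribution factorizes exactly under both X → Y and Y → X, so observational data alone cannot separate the two graphs.

```python
from itertools import product

# Hypothetical joint distribution P(X, Y) over two binary variables.
# Keys are (x, y); the probabilities are illustrative only.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}


def marginal(joint, axis):
    """Marginal distribution of the variable at position `axis`."""
    out = {0: 0.0, 1: 0.0}
    for values, p in joint.items():
        out[values[axis]] += p
    return out


def factorize(joint, parent_axis):
    """Return P(parent) and P(child | parent) for the DAG parent -> child."""
    child_axis = 1 - parent_axis
    p_parent = marginal(joint, parent_axis)
    p_cond = {}
    for values, p in joint.items():
        parent_val = values[parent_axis]
        child_val = values[child_axis]
        # Conditional table is keyed as (child value, parent value).
        p_cond[(child_val, parent_val)] = p / p_parent[parent_val]
    return p_parent, p_cond


def reconstruct(p_parent, p_cond, parent_axis):
    """Rebuild P(X, Y) as P(parent) * P(child | parent)."""
    child_axis = 1 - parent_axis
    rebuilt = {}
    for values in product((0, 1), repeat=2):
        parent_val = values[parent_axis]
        child_val = values[child_axis]
        rebuilt[values] = p_parent[parent_val] * p_cond[(child_val, parent_val)]
    return rebuilt


# Factorize under X -> Y (parent is axis 0) and under Y -> X (parent is axis 1).
x_to_y = reconstruct(*factorize(joint, parent_axis=0), parent_axis=0)
y_to_x = reconstruct(*factorize(joint, parent_axis=1), parent_axis=1)

for key in sorted(joint):
    print(key, round(x_to_y[key], 6), round(y_to_x[key], 6))
```

Both reconstructions match the original joint up to floating-point error, which is why a two-variable graph can at best be recovered up to its Markov equivalence class. With more variables, conditional independences in the data can rule out some graphs, and that is the setting in which the identifiability result stated in the abstract applies.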