🤖 AI Summary
This work investigates the intrinsic asymmetry in predictor fusion under causal versus anticausal directions and its impact on generalization. The analysis centers on a minimal model: binary classification with one binary target and two continuous predictors. Using causal maximum entropy (CMAXENT) as the inductive bias and bivariate distributional constraints, the authors show that optimal fusion reduces to logistic regression in the causal direction and to Linear Discriminant Analysis (LDA) in the anticausal direction, with a closed-form correspondence between the two under full observability of the bivariate distributions. When only some bivariate distributions are observed, the decision boundaries of the two solutions diverge geometrically, and this asymmetry carries direct implications for out-of-variable (OOV) generalization, informing causal representation learning and robust prediction.
📝 Abstract
We study the differences arising from merging predictors in the causal and anticausal directions using the same data. In particular, we study the asymmetries that arise in a simple model where we merge the predictors using one binary variable as target and two continuous variables as predictors. We use Causal Maximum Entropy (CMAXENT) as the inductive bias to merge the predictors; however, we expect similar differences to hold for other merging methods that account for asymmetries between cause and effect. We show that if we observe all bivariate distributions, the CMAXENT solution reduces to logistic regression in the causal direction and to Linear Discriminant Analysis (LDA) in the anticausal direction. Furthermore, we study how the decision boundaries of these two solutions differ whenever we observe only some of the bivariate distributions, with implications for Out-Of-Variable (OOV) generalisation.
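The logistic-regression-versus-LDA correspondence can be illustrated numerically. The sketch below (not taken from the paper; data, parameters, and the gradient-ascent fit are illustrative assumptions) generates two Gaussian classes with a shared covariance, so the LDA generative assumptions hold: the closed-form LDA weight direction Σ⁻¹(μ₁ − μ₀) and a discriminatively fitted logistic regression then recover essentially the same linear decision boundary, consistent with the full-observability equivalence described in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two Gaussian classes with a shared covariance (the LDA model assumption).
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])
mu0, mu1 = np.array([0.0, 0.0]), np.array([2.0, 1.0])

n = 2000
X = np.vstack([rng.multivariate_normal(mu0, Sigma, n),
               rng.multivariate_normal(mu1, Sigma, n)])
y = np.concatenate([np.zeros(n), np.ones(n)])

# Anticausal-style solution: LDA weights in closed form, w ∝ Σ^{-1}(μ1 − μ0).
w_lda = np.linalg.solve(Sigma, mu1 - mu0)

# Causal-style solution: logistic regression fitted by gradient ascent
# on the log-likelihood (intercept included as an extra column).
Xb = np.hstack([X, np.ones((2 * n, 1))])
w_lr = np.zeros(3)
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-Xb @ w_lr))
    w_lr += 0.1 * Xb.T @ (y - p) / len(y)

# Under these assumptions the two weight directions align closely.
cos = w_lda @ w_lr[:2] / (np.linalg.norm(w_lda) * np.linalg.norm(w_lr[:2]))
print(round(cos, 3))
```

When the shared-covariance assumption is violated, or when only some bivariate moments are available, the two fits need not agree, which is where the asymmetries studied in the paper arise.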