AI Summary
This paper addresses key limitations of existing dependence measures, such as the contingency coefficient, lambda, tau, and the uncertainty coefficient. These limitations include the lack of an intuitive definition of "perfect dependence" for nominal variables, the inability to attain unity for mixed-type variables (especially those involving continuous components), sensitivity to marginal discretization, and systematic underestimation of strong associations. To resolve these issues, the paper proposes a dependence framework grounded in conditional distribution reconstruction and probability kernels. It formally defines perfect dependence for settings involving at least one nominal variable and constructs a family of correlation coefficients that take values in [0,1] and attain 1 if and only if perfect dependence holds. On the theoretical side, it establishes scale robustness and full attainability and derives the asymptotic distribution of the estimator to enable statistical inference. Simulations validate finite-sample performance, and two empirical applications (country and income; country and religion) uncover substantially stronger dependence than conventional measures indicate.
Abstract
This paper develops an intuitive concept of perfect dependence between two variables, at least one of which has a nominal scale, that is attainable for all marginal distributions. It proposes a set of dependence measures that equal 1 if and only if this perfect dependence holds. These measures have two advantages over classical dependence measures such as contingency coefficients, Goodman-Kruskal's lambda and tau, and the so-called uncertainty coefficient. Firstly, they are defined even if one of the variables is real-valued and has continuous components. Secondly, they satisfy the property of attainability: they can take all values in the interval [0,1] irrespective of the marginals involved. Neither property is shared by the classical dependence measures, which require two discrete marginal distributions and can in some situations yield values close to 0 even though the dependence is strong or even perfect. Additionally, I provide a consistent estimator for one of the new dependence measures together with its asymptotic distribution, both under independence and in the general case. This makes it possible to construct confidence intervals and an independence test, whose finite-sample performance I subsequently examine in a simulation study. Finally, I illustrate the use of the new dependence measure in two applications, on the dependence between country and income and between country and religion, respectively.
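The attainability problem of the classical measures can be illustrated with a small sketch. The code below uses the standard textbook definitions of Pearson's contingency coefficient and Goodman-Kruskal's lambda, not the new measures proposed in the paper, and the two 2x2 tables are made-up illustrative counts. The function names are hypothetical helpers introduced only for this example.

```python
from math import sqrt


def chi_square(table):
    """Pearson chi-square statistic and sample size for a table of counts."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    return chi2, n


def contingency_coefficient(table):
    """Pearson's contingency coefficient C = sqrt(chi2 / (chi2 + n))."""
    chi2, n = chi_square(table)
    return sqrt(chi2 / (chi2 + n))


def goodman_kruskal_lambda(table):
    """Lambda for predicting the column variable from the row variable."""
    n = sum(sum(row) for row in table)
    max_col_total = max(sum(col) for col in zip(*table))
    sum_row_max = sum(max(row) for row in table)
    return (sum_row_max - max_col_total) / (n - max_col_total)


# Illustrative counts (assumed, not taken from the paper).
# Perfect dependence: the row category determines the column category.
perfect = [[50, 0],
           [0, 50]]
# Clear dependence, but the modal column is the same in every row.
dependent = [[90, 10],
             [60, 40]]

c_perfect = contingency_coefficient(perfect)
lam_dependent = goodman_kruskal_lambda(dependent)
chi2_dependent, _ = chi_square(dependent)

print(c_perfect)       # stays below 1 despite perfect dependence
print(lam_dependent)   # 0.0 although the conditional distributions differ
print(chi2_dependent)  # chi-square statistic of the dependent table
```

In the first table the contingency coefficient stays below 1 even though the dependence is perfect, and in the second table lambda is exactly 0 although the conditional distributions of the column variable differ across rows. This is the kind of behaviour that the attainability property is meant to rule out.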