🤖 AI Summary
This paper identifies a fundamental flaw in Pearson's chi-square test for homogeneity of proportions (i.e., equality of proportions across groups in a contingency table): its test statistic is not scale-invariant. Scaling all cell frequencies by a common factor scales the statistic linearly, so the outcome of the test can be changed artificially by rescaling the sample size, which violates foundational principles of statistical inference. Method: The authors provide the first rigorous proof that this non-invariance induces test invalidity and formally establish scale invariance as a necessary condition for valid homogeneity testing. Leveraging invariance principles and formal contingency table modeling, they reconstruct the methodological foundation for proportion homogeneity testing. Contribution/Results: The work proposes a theoretically grounded, scale-invariant alternative framework. It cautions against uncritical application of classical chi-square tests for proportion comparisons in medicine, the social sciences, and related fields, and advances the development of scale-invariant statistical tests.
📝 Abstract
Pearson's chi-square tests are among the most commonly applied statistical tools across a wide range of scientific disciplines, including medicine, engineering, biology, sociology, marketing, and business. However, in some areas they are used incorrectly. For example, the chi-square test for homogeneity of proportions (that is, comparing proportions across groups in a contingency table) is frequently used to verify whether the rows of a given nonnegative $m \times n$ (contingency) matrix $A$ are proportional. The null hypothesis $H_0$: "the $m$ rows are proportional" (for the whole population) is rejected with confidence level $1 - \alpha$ if and only if $\chi^2_{stat} > \chi^2_{crit}$, where the first term is given by Pearson's formula, while the second depends only on $m$, $n$, and $\alpha$, but not on the entries of $A$. It is immediate that Pearson's formula is not invariant under scaling. More precisely, whenever we multiply all entries of $A$ by a constant $c$, the value $\chi^2_{stat}(A)$ is multiplied by $c$ too: $\chi^2_{stat}(cA) = c \, \chi^2_{stat}(A)$. Thus, if all rows of $A$ are exactly proportional, then $\chi^2_{stat}(cA) = c \, \chi^2_{stat}(A) = 0$ for any $c$ and any $\alpha$. Otherwise, $\chi^2_{stat}(cA)$ becomes arbitrarily large or small as the positive constant $c$ increases or decreases. Hence, at any fixed significance level $\alpha$, the null hypothesis $H_0$ will be rejected with confidence $1 - \alpha$ when $c$ is sufficiently large and not rejected when $c$ is sufficiently small. Yet, obviously, the rows of $cA$ should be proportional or not for all $c$ simultaneously. Thus, any reasonable formula for the test statistic must be invariant, that is, take the same value for the matrices $cA$ for all real positive $c$. KEY WORDS: Pearson chi-square test, difference between two proportions, goodness of fit, contingency tables.
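
The scaling identity $\chi^2_{stat}(cA) = c \, \chi^2_{stat}(A)$ can be checked numerically. The sketch below is only an illustration and is not taken from the paper. It assumes that "Pearson's formula" means the standard statistic $\sum_{i,j} (O_{ij} - E_{ij})^2 / E_{ij}$ with expected counts $E_{ij} = (\text{row}_i \cdot \text{col}_j) / N$, which the abstract does not write out. The example table `A`, the helper names `pearson_chi2` and `scale`, and the scaling factors are hypothetical. The critical value 3.841 is the usual one for one degree of freedom at $\alpha = 0.05$.

```python
def pearson_chi2(table):
    """Pearson chi-square statistic for a table with positive row and column totals.

    Assumes the standard formula: sum over cells of (O - E)^2 / E,
    where E = row_total * col_total / grand_total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def scale(table, c):
    """Multiply every entry of the table by the positive constant c."""
    return [[c * x for x in row] for row in table]


# Hypothetical 2 x 2 table whose rows are not exactly proportional.
A = [[12, 8], [9, 11]]

# Critical value for df = (2 - 1) * (2 - 1) = 1 at alpha = 0.05.
CHI2_CRIT = 3.841

for c in (0.1, 1, 10, 100):
    stat = pearson_chi2(scale(A, c))
    decision = "reject H0" if stat > CHI2_CRIT else "do not reject H0"
    print(f"c = {c}: chi2_stat = {stat:.4f} -> {decision}")

# Exactly proportional rows give a statistic of zero for every c.
P = [[2, 4], [3, 6]]
print(pearson_chi2(P), pearson_chi2(scale(P, 1000)))
```

For this table the statistic grows linearly in `c`, so the decision switches from "do not reject" to "reject" between `c = 1` and `c = 10`, even though the row proportions in `cA` are identical for every `c`.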