I am starting to think that Simpson’s Paradox (Simpson, 1951) and the Aggregation Paradox (which elsewhere I term “multilevel structure”; cf. Gelman et al., 2007; Nieuwenhuis, 2015) are variations of the same phenomenon.
Consider a simple regression equation, where \(i\) indexes observations and \(j\) indexes groups.
\[y_{ij} = \beta_0 + \beta_1 x_{ij} + \beta_2 z_{ij} + u_{0j} + e_{ij}\]
- Simpson’s Paradox could occur if we omit \(z_{ij}\) from the regression equation.
- The Aggregation Paradox could occur if we omit \(u_{0j}\) from the regression equation.
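To make the parallel concrete, here is a minimal simulation sketch of the two omissions. Everything numeric in it is an assumption chosen for illustration: the coefficient values (\(\beta_1 = -1\), \(\beta_2 = 5\)), the sample sizes, and the fact that \(z_{ij}\) and \(u_{0j}\) are both constructed to be correlated with \(x_{ij}\), since without that correlation neither omission would flip the sign. The helper names `ols_slope` and `demean` are hypothetical, and within-group demeaning is used as a simple stand-in for accounting for \(u_{0j}\), not as a full multilevel model.

```python
import numpy as np


def ols_slope(y, *columns):
    """Return the OLS coefficient on the first predictor (intercept included)."""
    X = np.column_stack([np.ones_like(y), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]


rng = np.random.default_rng(0)
n_groups, n_per_group = 50, 40          # illustrative sizes (assumption)
group = np.repeat(np.arange(n_groups), n_per_group)
n = group.size

# Case 1: omit z_ij. Here z is built to be correlated with x.
z = rng.normal(size=n)
x1 = 2.0 * z + rng.normal(size=n)
y1 = -1.0 * x1 + 5.0 * z + rng.normal(size=n)   # true beta_1 = -1
simpson_omit = ols_slope(y1, x1)        # z left out
simpson_full = ols_slope(y1, x1, z)     # z included

# Case 2: omit u_0j. Here u is built to be correlated with the group mean of x.
m = rng.normal(scale=2.0, size=n_groups)        # group-level mean of x
u = 3.0 * m                                     # group-level term u_0j
x2 = m[group] + rng.normal(size=n)
y2 = -1.0 * x2 + u[group] + rng.normal(size=n)  # true beta_1 = -1


def demean(v):
    """Subtract each observation's group mean (removes anything constant within a group)."""
    means = np.bincount(group, weights=v) / np.bincount(group)
    return v - means[group]


agg_omit = ols_slope(y2, x2)                    # u left out, data pooled
agg_within = ols_slope(demean(y2), demean(x2))  # u removed by demeaning

print(f"omit z:  {simpson_omit:+.2f}   include z:    {simpson_full:+.2f}")
print(f"omit u:  {agg_omit:+.2f}   within-group: {agg_within:+.2f}")
```

Under these assumptions, the slope on \(x\) should come out positive when either term is omitted and negative once it is accounted for, which is the sense in which the two paradoxes look like the same omitted-term problem at different levels.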
References
Gelman, A., Shor, B., Bafumi, J., & Park, D. (2007). Rich state, poor state, red state, blue state: What’s the matter with Connecticut? Quarterly Journal of Political Science, 2, 345–367. https://doi.org/10.2139/ssrn.1010426
Nieuwenhuis, R. (2015). Association, aggregation, and paradoxes: On the positive correlation between fertility and women’s employment. Demographic Research, 32. https://www.demographic-research.org/volumes/vol32/23/
Simpson, E. H. (1951). The interpretation of interaction in contingency tables. Journal of the Royal Statistical Society. Series B (Methodological), 13, 238–241. http://www.jstor.org/stable/2984065