Simulation of Simpson’s Paradox With Hospital Data
Author
Andy Grogan-Kaylor
Published
July 10, 2025
1 Background
Simpson’s paradox occurs when the direction of a bivariate association is reversed once a third variable is accounted for in a multivariate model. This example uses simulated data from hospitals (Wang et al., 2018).
2 Setup
clear all
use "hospitaldata.dta", clear
list
     +------------------------------------------+
     | hospital      severity   outcome   count |
     |------------------------------------------|
  1. |   better   less severe   success      18 |
  2. |   better   less severe   failure       2 |
  3. |   better   more severe   success      32 |
  4. |   better   more severe   failure      48 |
  5. |   normal   less severe   success      64 |
     |------------------------------------------|
  6. |   normal   less severe   failure      16 |
  7. |   normal   more severe   success       4 |
  8. |   normal   more severe   failure      16 |
     +------------------------------------------+
3 Outcome By Hospital Type
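In the aggregate, patients appear to do better at the normal hospital than at the better hospital: summing over severity, 68 of 100 patients at the normal hospital have a successful outcome, compared with 50 of 100 at the better hospital.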
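One way to check this aggregate comparison is a two-way table of outcome by hospital. The following is a minimal sketch, not necessarily the command used by the author. It assumes the data are stored in the collapsed form listed above, with the variable `count` holding the number of patients in each cell, so that `count` can serve as a frequency weight.

```stata
* Sketch: cross-tabulate outcome by hospital, ignoring severity.
* Assumes count is the number of patients represented by each row,
* so it is supplied as a frequency weight.
* The row option requests row percentages, that is, the share of
* successes and failures within each hospital.
tabulate hospital outcome [fweight = count], row
```

Because severity is left out of this table, the row percentages describe only the bivariate association between hospital and outcome.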