Simpson’s paradox
I’ve never seen Simpson’s paradox explained by showing the distributions, which I found to be very helpful in understanding what’s going on. So, take a look at these two distributions I made from calls to random()
:
The cyan distribution has a larger mean than the magenta one, but anyone worth their salt should recognize them as mixtures of unimodal distributions (indeed, that’s how I made them) and look for a binary factor to separate the subpopulations. Upon separating them, in both cases the magentas’ means are higher than the cyans’, Simpson’s paradox.
Comments Off on Simpson’s paradox