The Deceiving Statistics

How can you avoid being misled by data?

Pratik Tawade
Science For Life
2 min read · Mar 13, 2022

Organized data matters a great deal, because organizations, researchers, and governments base decisions on how they interpret it. But there is a catch: any set of statistics might have something lurking inside it, something that can turn the results completely upside down.

For example, imagine that you need to choose between Hospital A and Hospital B for a relative’s surgery. The data so far show that, of the last 1,000 patients admitted to each hospital, 900 survived at Hospital A (900/1000) while 800 survived at Hospital B (800/1000). So it looks like Hospital A is the better choice.

But remember, not all patients arrive at the hospital with the same level of health. If we divide each hospital’s last 1,000 patients into those who arrived in good health and those who arrived in poor health, the picture looks completely different.

Hospital A had only 100 patients who arrived in poor health, of whom 30 survived (30/100, or 30%). Hospital B had 400, and it saved 210 of them (210/400, or 52.5%). So Hospital B is the better choice for patients arriving in poor health.

And if the patient arrives in good health, then, strangely enough, Hospital B is still the better choice, with a survival rate of over 98% (590/600) against about 96.7% at Hospital A (870/900). So how can Hospital A have a better overall survival rate if Hospital B has a better survival rate for patients in each of the two groups?
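To check the arithmetic yourself, here is a minimal Python sketch. It uses the numbers from the example; the good-health counts are not quoted directly in the figures above, so they are derived by subtracting the poor-health counts from each hospital's totals. The names `counts`, `rate`, and `overall` are just illustrative.

```python
# (survived, admitted) per hospital and health group, from the example.
# Assumption: good-health counts = hospital totals minus poor-health counts,
# e.g. Hospital A: 900 - 30 = 870 survivors out of 1000 - 100 = 900 patients.
counts = {
    "A": {"good": (870, 900), "poor": (30, 100)},
    "B": {"good": (590, 600), "poor": (210, 400)},
}


def rate(survived, admitted):
    """Survival rate as a fraction between 0 and 1."""
    return survived / admitted


def overall(hospital):
    """Pool both health groups and return (survived, admitted)."""
    groups = list(counts[hospital].values())
    return sum(s for s, _ in groups), sum(n for _, n in groups)


for hospital in counts:
    for group, (s, n) in counts[hospital].items():
        print(f"Hospital {hospital}, {group} health: {rate(s, n):.1%}")
    print(f"Hospital {hospital}, overall: {rate(*overall(hospital)):.1%}")
```

Hospital B comes out ahead in each health group, yet Hospital A comes out ahead once the groups are pooled.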

What we stumbled upon is Simpson’s paradox. Simpson’s paradox occurs when groups of data show one particular trend, but this trend is reversed when the groups are combined. This often happens when aggregated data hides a lurking variable, a hidden factor that significantly influences the results. In the hospital example, the hidden factor is the share of patients who arrive in poor health: 10% at Hospital A versus 40% at Hospital B.
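One way to see the hidden factor at work: a hospital's overall survival rate is a weighted average of its group rates, weighted by the share of patients in each group. The sketch below recomputes the overall rates that way from the example's numbers (good-health figures obtained by subtraction from the totals), then asks a purely hypothetical question: what would Hospital B's overall rate be if it had Hospital A's mix of patients? The helper name `weighted_overall` is made up for this illustration.

```python
def weighted_overall(rates, shares):
    """Overall rate = sum over groups of (group rate x group's share of patients)."""
    return sum(rates[g] * shares[g] for g in rates)


# Survival rate within each health group, from the example.
rates_a = {"good": 870 / 900, "poor": 30 / 100}
rates_b = {"good": 590 / 600, "poor": 210 / 400}

# Share of each hospital's 1000 patients in each health group.
shares_a = {"good": 900 / 1000, "poor": 100 / 1000}
shares_b = {"good": 600 / 1000, "poor": 400 / 1000}

overall_a = weighted_overall(rates_a, shares_a)
overall_b = weighted_overall(rates_b, shares_b)

# Hypothetical scenario: Hospital B's group rates with Hospital A's patient mix.
b_with_a_mix = weighted_overall(rates_b, shares_a)

print(f"A overall: {overall_a:.2%}")
print(f"B overall: {overall_b:.2%}")
print(f"B with A's patient mix: {b_with_a_mix:.2%}")
```

With the same patient mix, Hospital B's overall rate would be about 93.75%, above Hospital A's 90%. The reversal in the pooled numbers comes from who walks through the door, not from how well each hospital treats them.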

So how do we avoid falling for the paradox?

Data can be grouped in many ways, and overall numbers can sometimes give a more accurate picture than data divided into various categories. All we can do is carefully study the actual situation the statistics describe and consider whether a lurking variable may be present. Otherwise, we leave ourselves open to being exploited by others who manipulate data to serve their own agenda.

“Never leave a number all by itself. Never believe that one number on its own can be meaningful. If you are offered one number, always ask for at least one more. Something to compare it with.”

~ Hans Rosling

That’s a wrap — Thank you for reading!


I am a science communicator. I like to explain complex concepts in easy-to-understand language with relatable examples. Support me: https://ko-fi.com/pratikt