The desire to solve problems is perhaps natural to all humans. Being unable to identify the causes of a problem, particularly one that touches our personal and social lives, is uncomfortable. Regardless of how hard the problem is or how much expertise we have in the area, we often settle on some cause-and-effect relationship (a change in X causes a change in Y) and then propose a solution such as: “by changing X (the cause), we can change Y (the outcome/effect).”

Whether in our personal or professional lives, one of the ways we attempt to identify the causes of a problem is by looking for events that occur at the same time as the problem of interest. Of course, many things happen around us all the time, some more noticeable than others, but we tend to overemphasize the events that are most immediately available and visible. Psychologists call this tendency the “availability heuristic”.

For example, if the number of crimes in a particular neighborhood increases over a period of time and a demographic shift (a more visible change) takes place within the same period, people may start assuming that the demographic change caused the increase in criminal activity. But is that evidence enough to support such a strong causal inference?

In this article, we will illustrate how a causal inference based on a simple bivariate correlation/association can go horribly wrong in the presence of a confounding variable.
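
As a rough preview of the idea, here is a minimal simulation sketch. It is not taken from the article: the variable names `x`, `y`, and `z`, the sample size, the noise levels, and the helper functions `pearson` and `partial_corr` are all assumptions made for illustration. A hidden variable `z` drives both `x` and `y`, while `x` has no effect on `y` at all. The two still end up correlated, and the correlation largely disappears once `z` is controlled for.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of X and Y, linearly controlling for Z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
n = 5000                # illustrative sample size (an assumption)

# z is the confounder. x and y each depend on z plus independent noise;
# x never appears in the equation for y, so x has no causal effect on y.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

r_xy = pearson(x, y)  # theoretical value under this setup is 0.5
r_xz = pearson(x, z)
r_yz = pearson(y, z)
r_xy_given_z = partial_corr(r_xy, r_xz, r_yz)  # theoretical value is 0

print(f"corr(x, y)     = {r_xy:.3f}")
print(f"corr(x, y | z) = {r_xy_given_z:.3f}")
```

Under these assumptions, the raw correlation between `x` and `y` sits near 0.5 even though neither causes the other, and the partial correlation given `z` sits near zero. Real data is rarely this clean: the confounder may be unmeasured, and the relationships need not be linear.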

#spurious-correlation #causal-inference #data-science #cofounders #causality

Confounding Variable and Spurious Correlation: Key Challenge