Identifying Context: Correlation vs. Causation
Explore how to interpret correlations in data while recognizing that correlation does not imply causation. Learn to identify confounding variables and consider additional contextual factors to craft accurate and responsible narratives from data. Understand best practices for inferring potential causes and how to communicate assumptions clearly in your data storytelling.
It can be challenging to tell a story without knowing the full context.
"Correlation is not causation" is a common saying in data science. It is a reminder that a relationship found through correlation analysis does not, on its own, show that one event or variable causes another.
Confounding variables are variables that influence both of the variables being compared, and they are often ones that were not considered during data analysis. They can produce or distort a correlation, which complicates data storytelling because the storyteller is working without the full causal context. However, investigating these relationships can also help us mine interesting insights from our data.
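To make the idea of a confounder concrete, here is a minimal sketch using simulated data. Everything in it is an assumption made for illustration: the variable names (temperature, ice_cream, swimmers), the coefficients, and the noise levels are invented, and the helper functions pearson and residuals are hypothetical, not part of any library. In the simulation, temperature drives both of the other variables, and they have no direct effect on each other.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, xs):
    """What is left of ys after removing a straight-line fit on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return [(y - mean_y) - slope * (x - mean_x) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical confounder: daily temperature.
temperature = [rng.gauss(20, 8) for _ in range(2000)]

# Both outcomes depend on temperature only, never on each other.
ice_cream = [2.0 * t + rng.gauss(0, 5) for t in temperature]
swimmers = [1.5 * t + rng.gauss(0, 5) for t in temperature]

# Raw correlation: looks like a strong relationship.
raw_r = pearson(ice_cream, swimmers)

# Correlation after removing the part explained by temperature
# (a simple partial correlation).
partial_r = pearson(
    residuals(ice_cream, temperature),
    residuals(swimmers, temperature),
)

print(f"raw correlation:     {raw_r:.2f}")
print(f"partial correlation: {partial_r:.2f}")
```

Under these assumptions, the raw correlation should come out strongly positive, while the partial correlation should sit close to zero. That gap is the signature of a confounder: the two variables move together only because a third variable moves them both. With real data the confounder is usually not known in advance, so a check like this can support a causal story but cannot prove one.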
Correlations and causal context
Let's consider a few case studies of correlations that have some causal context behind them. Our objective with this analysis is to go beyond saying that there is a strong positive or weak negative correlation and to identify potential causal links between the variables.