# Causal Data Science

I started a series of posts aimed at helping people learn about causality in data science (and science in general), and wanted to compile them all together here in a living index. This list will grow as I post more:

The goal of this post is to develop a basic understanding of the intuition behind causal graphs. It’s aimed at a general audience, and by the end of it, you should be able to intuitively understand causal diagrams, and reason about ways that the picture might be incomplete.

2. **Understanding Bias: A Prerequisite For Trustworthy Results**

This post is aimed at a general audience. The goal is to understand what bias is, where it comes from, and how drawing a causal diagram can help you reason about bias.

3. **Speed vs. Accuracy: When Is Correlation Enough? When Do You Need Causation?**

The goal of this article is to understand some common errors in data analysis, and to motivate balancing your data resources between fast (correlative) and slow (causal) insights.

4. **A Technical Primer on Causality**

This is a very technical introduction to the material from the previous posts, aimed at practitioners with a background in regression analysis and probability.

5. **The Data Processing Inequality**

To understand observational, graphical causal inference, you need to understand “conditional independence testing” (CIT). CIT can be sensitive to how you encode your data, a problem that is sometimes swept under the rug. This article brings it into the spotlight, and is a precursor to our discussion of causal inference!

If you can’t experiment on a system, is there any hope for establishing causality? In some cases, with certain assumptions (and not the usual “no latent variables” ones!!), the answer is “yes”. In this post, I present a teaser on some relatively old work that has been done on the subject. Next time, we’ll dig deeply into how this works!

7. An observational criterion for causation (coming soon)…
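
As a small taste of the encoding issue mentioned for the data processing inequality post, here is a minimal sketch. It is not taken from that post. It assumes a simulated linear-Gaussian chain X → Z → Y, and it uses partial correlation as a simple stand-in for a conditional independence test. The helper `partial_corr` and all variable names are made up for illustration. Conditioning on the full `z` should leave `x` and `y` roughly uncorrelated, while conditioning on a coarsely encoded (binarized) `z` leaves a clear residual dependence.

```python
import numpy as np

def partial_corr(x, y, z):
    """Correlation of x and y after linearly regressing each on z (with intercept)."""
    design = np.column_stack([np.ones_like(z, dtype=float), z])
    resid_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    resid_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return float(np.corrcoef(resid_x, resid_y)[0, 1])

# Simulated chain X -> Z -> Y (an assumption made purely for illustration).
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)
z = x + rng.normal(size=n)
y = z + rng.normal(size=n)

# A lossy encoding of z: keep only its sign.
z_coarse = (z > 0).astype(float)

pc_full = partial_corr(x, y, z)           # conditioning on the full z
pc_coarse = partial_corr(x, y, z_coarse)  # conditioning on the coarse z

print(f"partial corr given full z:   {pc_full:.3f}")
print(f"partial corr given coarse z: {pc_coarse:.3f}")
```

In this setup, X and Y are independent given Z by construction, so a test that only sees the coarse encoding would wrongly suggest a direct dependence between them.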