How to Calculate Conditional Probabilities from Any DataFrame in 3 Lines of Code
Learn to write a simple Python function that will calculate conditional probabilities using notation like p(exam=1 | study=1)
Background
As I have continued to delve into causal inference, I reached a stage where I needed to construct formulas that use complex combinations of conditional probabilities, and the code was becoming difficult to read and maintain. This led me to develop a simple way to calculate conditional probabilities.
By the end of this article you will be equipped with a short Python function that can apply a conditional probability directly to any pandas DataFrame and return the result.
A Recap
Conditional probability is the probability of an event occurring given that another event has already occurred. The notation 𝑃(𝐴∣𝐵) can be understood as the probability (P) of event A given (|) that event B has already occurred.
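For reference, the standard textbook definition behind this notation can be written as follows. It is general probability theory rather than anything specific to this article, and it only applies when event B has non-zero probability:

```latex
P(A \mid B) = \frac{P(A \cap B)}{P(B)}, \qquad P(B) > 0
```

In words: of all the cases where B occurred, it is the fraction in which A also occurred.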
To extend this one step further, the notation 𝑃(𝐴∣𝐵,𝐶) means the probability of event A given that both event B and event C have already occurred.
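To make the two notations concrete, here is a minimal sketch of how 𝑃(𝐴∣𝐵) and 𝑃(𝐴∣𝐵,𝐶) might be estimated from a pandas DataFrame as relative frequencies. The toy data, the column names (study, tutor, exam) and the helper name conditional_probability are all assumptions made for illustration; this is not the function the article goes on to build.

```python
import pandas as pd

# Hypothetical toy data: one row per student (values invented for illustration).
df = pd.DataFrame({
    "study": [1, 1, 1, 1, 0, 0, 0, 0],
    "tutor": [1, 1, 0, 0, 1, 1, 0, 0],
    "exam":  [1, 1, 1, 0, 1, 0, 0, 0],
})


def conditional_probability(data: pd.DataFrame, event: str, given: str) -> float:
    """Estimate P(event | given) as a relative frequency.

    `event` and `given` are pandas query strings, e.g. "exam == 1".
    (Illustrative helper; the name and signature are assumptions.)
    """
    # Keep only the rows where the conditioning event(s) occurred.
    conditioned = data.query(given)
    if conditioned.empty:
        raise ValueError(f"No rows satisfy the condition: {given}")
    # Among those rows, count the fraction where the event of interest occurred.
    return len(conditioned.query(event)) / len(conditioned)


# P(exam=1 | study=1): condition on a single event B.
p_a_given_b = conditional_probability(df, "exam == 1", "study == 1")

# P(exam=1 | study=1, tutor=1): condition on both B and C.
p_a_given_bc = conditional_probability(df, "exam == 1", "study == 1 and tutor == 1")

print(p_a_given_b)   # 0.75
print(p_a_given_bc)  # 1.0
```

With this toy data, four students studied and three of them passed, so the first estimate is 0.75. Only two students both studied and had a tutor, and both passed, so the second estimate is 1.0. Conditioning on more events shrinks the set of rows being counted, which is why estimates based on many conditions can become unreliable on small datasets.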