Build causal graphs and apply Pearl's do-calculus—distinguish correlation from causation with interventions
Judea Pearl's causal inference framework distinguishes between mere correlation and genuine causation using directed acyclic graphs (DAGs) and the do-operator. This is fundamental to understanding when we can legitimately draw causal conclusions from observational data.
A causal graph represents variables as nodes and direct causal relationships as directed edges. The structure encodes which variables causally influence others.
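As a minimal sketch, a DAG can be represented as an adjacency mapping from each node to the set of nodes it directly causes. The variable names (smoking, tar, cancer) and the edges among them are illustrative, not from the original text:

```python
# Illustrative causal DAG: each key causes every node in its value set.
# Edges: smoking -> tar, smoking -> cancer, tar -> cancer.
causal_graph = {
    "smoking": {"tar", "cancer"},
    "tar": {"cancer"},
    "cancer": set(),
}

def parents(graph, node):
    """Direct causes of `node`: every variable with an edge into it."""
    return {v for v, children in graph.items() if node in children}
```

With this encoding, `parents(causal_graph, "cancer")` returns `{"smoking", "tar"}`, i.e. the direct causes of cancer in this toy graph.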
The critical distinction in causal inference is between passive observation and active intervention:
When we intervene on X, we "cut" all incoming arrows to X, removing confounding effects and isolating the causal effect.
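This "graph surgery" can be sketched directly on the adjacency-mapping representation: intervening on a node deletes every edge pointing into it while leaving its outgoing edges intact. The example graph (Z causes both X and Y, X causes Y) is a hypothetical confounding structure:

```python
def do(graph, target):
    """Graph surgery for do(target): return a copy of the DAG with every
    edge *into* `target` removed; `target`'s outgoing edges are kept."""
    return {node: (set(children) if node == target
                   else children - {target})
            for node, children in graph.items()}

# Hypothetical confounded graph: Z -> X, Z -> Y, X -> Y.
g = {"Z": {"X", "Y"}, "X": {"Y"}, "Y": set()}
# After do(X), the confounding edge Z -> X is cut; Z -> Y and X -> Y remain.
print(do(g, "X"))
```

In the surgically altered graph, X no longer listens to Z, so any remaining association between X and Y must flow through the causal edge X → Y.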
A backdoor path is a non-causal path from X to Y that starts with an arrow pointing into X, typically running through a common cause. For example, in the graph X ← Z → Y, the path through Z is a backdoor path: it creates correlation between X and Y without any causation.
To identify the causal effect of X on Y from observational data, we must block all backdoor paths by conditioning on appropriate variables (the backdoor criterion).
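When the confounder Z is observed, the backdoor adjustment formula P(y | do(x)) = Σ_z P(y | x, z) P(z) is estimable from observational data. The sketch below uses hypothetical binary structural equations (the probabilities 0.8/0.2, and the effect sizes 0.2 for X and 0.4 for Z, are made up for illustration); the naive contrast is confounded, while the adjusted contrast recovers the true effect of 0.2:

```python
import random

random.seed(1)

# Synthetic binary data from Z -> X, Z -> Y, X -> Y.
# True causal effect of X on Y is +0.2 by construction.
data = []
for _ in range(100_000):
    z = random.random() < 0.5
    x = random.random() < (0.8 if z else 0.2)
    y = random.random() < (0.3 + 0.2 * x + 0.4 * z)
    data.append((z, x, y))

def prob(event, given=lambda r: True):
    """Empirical P(event | given) over the dataset."""
    rows = [r for r in data if given(r)]
    return sum(event(r) for r in rows) / len(rows)

# Naive observational contrast P(y|x=1) - P(y|x=0): inflated by Z.
naive = (prob(lambda r: r[2], lambda r: r[1])
         - prob(lambda r: r[2], lambda r: not r[1]))

# Backdoor adjustment: P(y | do(x)) = sum_z P(y | x, z) P(z).
def p_do(x):
    return sum(
        prob(lambda r: r[2], lambda r: r[1] == x and r[0] == z)
        * prob(lambda r: r[0] == z)
        for z in (False, True)
    )

adjusted = p_do(True) - p_do(False)
print(naive, adjusted)  # naive is far above 0.2; adjusted is close to 0.2
```

Conditioning on Z inside the sum blocks the backdoor path X ← Z → Y, which is why the adjusted estimate matches the effect built into the simulation while the naive one does not.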
Simpson's paradox occurs when a trend appears in each of several groups but reverses when the groups are combined. This happens when we fail to account for a confounder.
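The classic kidney-stone treatment counts are a standard numeric illustration: treatment A has the higher recovery rate within both the small-stone and large-stone strata, yet the lower rate overall, because stone size (the confounder) influences both which treatment is chosen and the outcome:

```python
# Recovery counts (recoveries, patients) from the classic kidney-stone
# example commonly used to illustrate Simpson's paradox.
small = {"A": (81, 87),   "B": (234, 270)}   # small stones
large = {"A": (192, 263), "B": (55, 80)}     # large stones

def rate(successes, total):
    return successes / total

# Within each stratum, treatment A beats treatment B.
for stratum in (small, large):
    assert rate(*stratum["A"]) > rate(*stratum["B"])

# Pooling reverses the ranking: A is given mostly to the harder (large-stone)
# cases, dragging its aggregate rate down.
total_a = (small["A"][0] + large["A"][0], small["A"][1] + large["A"][1])
total_b = (small["B"][0] + large["B"][0], small["B"][1] + large["B"][1])
print(rate(*total_a), rate(*total_b))  # ~0.78 vs ~0.83: the trend flips
```

Stratifying by stone size corresponds exactly to conditioning on the confounder Z; the per-stratum comparison is the causally meaningful one here.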
Causal graphs make clear when we should condition on Z (to remove confounding) and when we should not (to avoid conditioning on a collider or mediator).
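The collider case is worth seeing numerically, since it is the mirror image of confounding: here X and Y are independent causes of C (X → C ← Y), and it is *conditioning* on C that manufactures a spurious association ("explaining away"). The structural equations are assumptions for illustration:

```python
import random

random.seed(2)

def corr(a, b):
    """Pearson correlation, dependency-free."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

# X and Y are independent; both cause the collider C (X -> C <- Y).
xs = [random.gauss(0, 1) for _ in range(50_000)]
ys = [random.gauss(0, 1) for _ in range(50_000)]
cs = [x + y for x, y in zip(xs, ys)]

# Marginally, X and Y are (nearly) uncorrelated.
print(corr(xs, ys))

# Conditioning on the collider -- e.g. selecting the subpopulation with
# C > 1 -- induces a strong negative correlation between X and Y.
sel = [(x, y) for x, y, c in zip(xs, ys, cs) if c > 1]
print(corr([x for x, _ in sel], [y for _, y in sel]))
```

Intuitively, once we know C is large, learning that X is small makes Y more likely to be large, so the two become negatively associated within the selected subpopulation even though neither causes the other.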
Pearl's do-calculus provides three rules for transforming expressions involving the do-operator, allowing us to determine when causal effects can be identified from observational data. The key insight: interventions break incoming causal arrows, changing the graph structure.
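For reference, the three rules can be stated in Pearl's standard notation, where $G_{\overline{X}}$ denotes the graph with all arrows into $X$ removed and $G_{\underline{Z}}$ the graph with all arrows out of $Z$ removed:

```latex
% Rule 1 (insertion/deletion of observations):
P(y \mid do(x), z, w) = P(y \mid do(x), w)
  \quad \text{if } (Y \perp\!\!\!\perp Z \mid X, W) \text{ in } G_{\overline{X}}

% Rule 2 (action/observation exchange):
P(y \mid do(x), do(z), w) = P(y \mid do(x), z, w)
  \quad \text{if } (Y \perp\!\!\!\perp Z \mid X, W) \text{ in } G_{\overline{X}\,\underline{Z}}

% Rule 3 (insertion/deletion of actions):
P(y \mid do(x), do(z), w) = P(y \mid do(x), w)
  \quad \text{if } (Y \perp\!\!\!\perp Z \mid X, W) \text{ in } G_{\overline{X}\,\overline{Z(W)}}
```

Here $Z(W)$ is the set of $Z$-nodes that are not ancestors of any $W$-node in $G_{\overline{X}}$. A causal effect is identifiable from observational data exactly when repeated application of these rules can eliminate every do-operator from the target expression.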