Causal Inference: A Statistical Learning Approach

Propensity Score Matching (PSM)

A statistical technique to estimate the effect of a treatment, policy, or other intervention by accounting for the covariates that predict receiving the treatment.

Used in observational studies where random assignment to treatment/control groups is not feasible.

Helps reduce selection bias by balancing the treatment and control groups on the observed covariates.

Idea → imitate a randomized controlled trial as closely as possible → make the treatment and control groups comparable on observed covariates (using the propensity score) → so that, assuming there are no unobserved confounders, the only systematic difference between the two groups is the treatment itself.

Real world applications:

Google → evaluate the effect of an optional software update on Android phones → how would you measure the causal effect of the optional update on user experience?
Amazon → How would you measure the causal effect of Amazon Prime on customer spending?

Propensity score → probability of receiving a particular treatment given a set of observed characteristics (i.e., covariates) → estimated using logistic regression or other methods → used to match units (e.g., a person, user, etc.) in treatment and control groups → create a matched control group that’s statistically similar to the treatment group (in terms of the covariates).
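
A minimal sketch of estimating propensity scores with logistic regression, assuming a binary treatment flag and numeric covariates. To stay self-contained it fits the model with plain gradient descent using only the standard library; in practice a library implementation of logistic regression would normally be used instead. The function names, the learning rate, the iteration count, and the synthetic data are all illustrative assumptions, not part of the notes above.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_propensity_model(X, t, lr=0.1, n_iter=2000):
    """Fit P(T=1 | X) by batch gradient descent on the logistic loss.

    X: list of covariate rows; t: list of 0/1 treatment indicators.
    Returns (weights, bias). Hyperparameters are illustrative defaults.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, ti in zip(X, t):
            # Prediction error for this unit: predicted probability minus label.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) - ti
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def propensity_scores(X, w, b):
    # Propensity score = predicted probability of treatment for each unit.
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) for x in X]


# Synthetic data (an assumption for illustration): one covariate that
# raises the chance of being treated.
random.seed(0)
X = [[random.gauss(0, 1)] for _ in range(200)]
t = [1 if random.random() < sigmoid(1.5 * x[0]) else 0 for x in X]

w, b = fit_propensity_model(X, t)
scores = propensity_scores(X, w, b)
```

Each entry of `scores` is the estimated probability that the corresponding unit receives the treatment, which is the quantity used for matching.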

Steps of PSM:

Estimate propensity scores → logistic regression or similar methods
Matching → nearest neighbor matching, stratification matching, kernel matching
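
The matching step can be sketched for the nearest-neighbor case as follows. This is one simple variant, assumed here for illustration: greedy 1:1 matching without replacement, with an optional caliper (a maximum allowed score difference). The function name, the caliper value, and the toy scores are hypothetical; stratification and kernel matching are not shown.

```python
def nearest_neighbor_match(scores, treated, caliper=None):
    """Greedy 1:1 nearest-neighbor matching on the propensity score.

    scores: propensity score per unit; treated: 0/1 flag per unit.
    Each control is used at most once (matching without replacement).
    Returns a list of (treated_index, control_index) pairs.
    """
    treated_idx = [i for i, flag in enumerate(treated) if flag == 1]
    available = [i for i, flag in enumerate(treated) if flag == 0]
    pairs = []
    for i in treated_idx:
        if not available:
            break
        # Closest remaining control by absolute score difference.
        j = min(available, key=lambda c: abs(scores[c] - scores[i]))
        if caliper is not None and abs(scores[j] - scores[i]) > caliper:
            continue  # no acceptable control; this treated unit stays unmatched
        pairs.append((i, j))
        available.remove(j)
    return pairs


# Toy propensity scores (made up for illustration).
scores = [0.80, 0.30, 0.75, 0.35, 0.60, 0.10]
treated = [1, 1, 0, 0, 0, 0]

pairs = nearest_neighbor_match(scores, treated, caliper=0.1)
print(pairs)  # [(0, 2), (1, 3)]
```

Because the matching is greedy, the result can depend on the order in which treated units are processed; the caliper trades a smaller matched sample for closer matches.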