Exact Matching

Motivating Matching for Causality

  • The goal of any causal analysis is to isolate some causal effect
  • To do this, we must satisfy the backdoor criterion in our study

    • Meaning, we must close all open backdoor paths
  • Closing backdoor paths can be achieved through carefully performing conditioning strategies in our study
  • Roughly, there are three different types of conditioning strategies:

    • Subclassification
    • Exact matching
    • Approximate matching

Motivating the Conditional Independence Assumption

  • The conditional independence assumption (CIA) states that treatment assignment is independent of potential outcomes after conditioning on observed covariates
  • Sometimes we know that randomization occurred only conditional on some observable characteristics

    • In that case, comparing groups without conditioning on those characteristics would violate the backdoor criterion
  • In order to estimate a causal effect when there is a confounder, we must satisfy CIA

    • In DAG notation, this refers to closing every backdoor path that runs through a confounder
    • Meaning, CIA implies there isn't any confounding bias
(Y^{1}, Y^{0}) \perp T \mid X
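One way to unpack this statement is the worked sketch below. It assumes the usual switching equation Y = T Y^{1} + (1 - T) Y^{0} and that both treated and control units exist at each value of X; neither assumption is spelled out in these notes, so read it as an illustration and not a full derivation.

```latex
\begin{aligned}
E[Y^{1} \mid T = 1, X] &= E[Y^{1} \mid T = 0, X] \\
E[Y^{0} \mid T = 1, X] &= E[Y^{0} \mid T = 0, X] \\
\Rightarrow \quad E[Y^{1} - Y^{0} \mid X] &= E[Y \mid T = 1, X] - E[Y \mid T = 0, X]
\end{aligned}
```

In words: within a group of units that share the same X, the observed difference in mean outcomes between treated and control units can be read as a causal effect.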

Introducing Matching for Estimating ATE

  • Matching is one of three conditioning methods used for satisfying the backdoor criterion
  • Matching estimates ATE by imputing missing potential outcomes while conditioning on the confounders
  • Specifically, we could fill in the missing potential outcome for each treatment group unit using the control group unit that is closest to it on some confounder X
  • This would give us estimates of all the counterfactuals, from which we could simply take the average over the differences
  • Because matched units share the same (or nearly the same) value of X, the comparison is made conditional on X, which is what CIA requires

Using Matching instead of Subclassification

  • Subclassification uses the difference between treatment and control group units and achieves covariate balance by using the K probability weights to weight the averages
  • As long as there is enough data for stratifying our covariates, subclassification can be a viable option
  • However, if subclassification suffers from the curse of dimensionality, then we must use other methods (like matching)
  • Typically, the curse of dimensionality does apply, so we'll prefer other methods like matching
  • Specifically, subclassification is a weighting method used on all individuals, regardless of the overlap of distributions
  • Whereas, matching is a form of stratification (or sampling method) that attempts to match distributions
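The curse of dimensionality mentioned above can be made concrete with a small sketch. The sample size of 1000 and the choice of 5 levels per covariate are hypothetical numbers picked for illustration; they do not come from these notes.

```python
from math import prod

def strata_count(levels_per_covariate):
    """Number of strata when every covariate is fully cross-classified."""
    return prod(levels_per_covariate)

def avg_units_per_stratum(n_units, levels_per_covariate):
    """Average number of units available in each stratum."""
    return n_units / strata_count(levels_per_covariate)

n_units = 1000  # hypothetical sample size
for n_covariates in (1, 2, 4, 8):
    levels = [5] * n_covariates  # assume each covariate has 5 levels
    print(n_covariates, strata_count(levels), avg_units_per_stratum(n_units, levels))
```

Once the average falls below one unit per stratum, many strata cannot contain both a treated and a control unit, which is the situation in which subclassification stops being viable.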

Illustrating Exact Matching

  • Suppose we have the following data:

    • Where, our earnings is Y
    • And, our age is a confounder X
    • And, an observation is either a trainee or a non-trainee

      • Which represents our treatment variable T
Trainees               Non-Trainees           Matched Sample
Unit   Age  Earnings   Unit   Age  Earnings   Unit         Age  Earnings
1       18      9500   1       20      8500   14            18      8050
2       29     12250   2       27     10075   6             29     10525
3       24     11000   3       21      8725   9             24      9400
4       27     11750   4       39     12775   2             27     10075
5       33     13250   5       38     12550   11            33     11425
6       22     10500   6       29     10525   13            22      8950
7       19      9750   7       39     12775   17            19      8275
8       20     10000   8       33     11425   1             20      8500
9       21     10250   9       24      9400   3             21      8725
10      30     12500   10      30     10750   avg(10,18)    30      9875
                       11      33     11425
                       12      36     12100
                       13      22      8950
                       14      18      8050
                       15      43     13675
                       16      39     12775
                       17      19      8275
                       18      30      9000
                       19      51     15475
                       20      48     14800
Mean  24.3    $11075   Mean 31.95    $11101   Mean        24.3     $9380
  • Notice, the treatment and control groups have different age distributions

    • So, we create a third group by sampling from the non-trainees group to match the age distribution of the trainees group
    • By imputing missing counterfactuals, we satisfy the CIA (which would have been violated otherwise)
  • Now, estimating ATE on this matched sample provides a better estimate:
\hat{\delta}_{ATE} = \frac{1}{N} \sum_{i=1}^{N} (2D_{i} - 1) \left[ Y_{i} - \left( \frac{1}{M} \sum_{m=1}^{M} Y_{j_{m}(i)} \right) \right]
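  • And, Y_{j_{m}(i)} refers to the outcome of the m^{th} of M units matched to the i^{th} unit, each chosen from the opposite treatment group because it is closest to the i^{th} unit on some covariate X

    • Here, i refers to an index over all N units, and D_{i} is the treatment indicator (T above), so (2D_{i} - 1) is +1 for a treatment group unit and -1 for a control group unit
    • Whereas, j_{m}(i) refers to an index in the control group when unit i is treated, and to an index in the treatment group when unit i is a control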
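The matched sample in the table above can be reproduced with a short sketch. The function name exact_match_effect and the data layout are illustrative choices, not part of the notes. The sketch matches only in one direction (each trainee to non-trainees of the same age, averaging ties as the table does for age 30), so it computes the treated-unit terms of the estimator. That is strictly an effect on the trainees and not a full ATE, which would also require a match for every non-trainee.

```python
from collections import defaultdict

# (age, earnings) per unit, copied from the table above.
trainees = [(18, 9500), (29, 12250), (24, 11000), (27, 11750), (33, 13250),
            (22, 10500), (19, 9750), (20, 10000), (21, 10250), (30, 12500)]
non_trainees = [(20, 8500), (27, 10075), (21, 8725), (39, 12775), (38, 12550),
                (29, 10525), (39, 12775), (33, 11425), (24, 9400), (30, 10750),
                (33, 11425), (36, 12100), (22, 8950), (18, 8050), (43, 13675),
                (39, 12775), (19, 8275), (30, 9000), (51, 15475), (48, 14800)]

def mean(values):
    return sum(values) / len(values)

def exact_match_effect(treated, controls):
    """Average of Y_i minus the mean earnings of controls with the same age."""
    earnings_by_age = defaultdict(list)
    for age, earnings in controls:
        earnings_by_age[age].append(earnings)
    differences = []
    for age, earnings in treated:
        matches = earnings_by_age.get(age)  # .get avoids creating empty entries
        if not matches:
            raise ValueError(f"no exact match for age {age}")
        differences.append(earnings - mean(matches))
    return mean(differences)

# Raw difference in means, ignoring the different age distributions.
naive_difference = mean([y for _, y in trainees]) - mean([y for _, y in non_trainees])
# Difference after matching each trainee to non-trainees of the same age.
matched_effect = exact_match_effect(trainees, non_trainees)

print(naive_difference)  # -26.25
print(matched_effect)    # 1695.0
```

The matched figure equals the gap between the Trainees mean ($11075) and the Matched Sample mean ($9380) in the table, while the raw comparison is slightly negative because the non-trainees are older on average.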

Motivating Approximate Matching

  • Exact matching works well if, for each unit, we can find a unit in the other group with exactly the same covariate value
  • Otherwise, we'll need to use approximate matching
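To see where exact matching runs out in the earnings example, the sketch below lists the units in each group that have no same-age counterpart in the other group. The helper name units_without_exact_match is hypothetical; the ages are copied from the table in the previous section.

```python
# Ages by unit number (1-based), copied from the example table.
trainee_ages = [18, 29, 24, 27, 33, 22, 19, 20, 21, 30]
non_trainee_ages = [20, 27, 21, 39, 38, 29, 39, 33, 24, 30,
                    33, 36, 22, 18, 43, 39, 19, 30, 51, 48]

def units_without_exact_match(ages, other_group_ages):
    """Unit numbers whose age does not appear anywhere in the other group."""
    available = set(other_group_ages)
    return [unit for unit, age in enumerate(ages, start=1) if age not in available]

unmatched_trainees = units_without_exact_match(trainee_ages, non_trainee_ages)
unmatched_controls = units_without_exact_match(non_trainee_ages, trainee_ages)

print(unmatched_trainees)  # []
print(unmatched_controls)  # [4, 5, 7, 12, 15, 16, 19, 20]
```

Every trainee has an exact match, but eight of the older non-trainees do not, so imputing their counterfactuals would need a closest-age rule instead of an exact one.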
