Andrew Gelman and Hal Varian on Instrumental Variables

Andrew Gelman

The trick: how to think about IVs without getting too confused

Suppose z is your instrument, T is your treatment, and y is your outcome. So the causal model is z -> T -> y. The trick is to think of (T,y) as a joint outcome and to think of the effect of z on each. For example, an increase of 1 in z is associated with an increase of 0.8 in T and an increase of 10 in y. The usual “instrumental variables” summary is to say that the estimated effect of T on y is 10/0.8 = 12.5, but I’d rather keep the two estimates apart and report the effect of z on T and the effect of z on y separately…

If there’s any problem with the simple correlation, I see the same problems with the more elaborate analysis: the pair of correlations which is given the label “instrumental variables analysis.” I’m not opposed to instrumental variables in general, but when I get stuck, I find it extremely helpful to go back and see what I’ve learned from separately thinking about the correlation of z with T, and the correlation of z with y, since that’s ultimately what instrumental variables analysis is doing.
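
As a rough illustration of reporting the two effects separately, here is a minimal Python sketch on simulated data. Everything in it is an assumption made for the example rather than something from the post: the helper names slope and simulate, the unobserved confounder u, the noise scales, and the choice of a = 0.8 and b = 12.5 so that the effect of z on y comes out near 10, matching the numbers above.

```python
import random


def slope(x, y):
    """Least-squares slope of y on x (intercept included)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def simulate(n=20000, a=0.8, b=12.5, seed=0):
    """Draw (z, T, y) where z -> T -> y and u confounds T and y (hypothetical setup)."""
    rng = random.Random(seed)
    z, T, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)               # instrument, independent of u
        u = rng.gauss(0, 1)                # unobserved confounder (assumption)
        Ti = a * zi + u + rng.gauss(0, 1)  # treatment depends on z and u
        yi = b * Ti + 5 * u + rng.gauss(0, 1)  # z reaches y only through T
        z.append(zi)
        T.append(Ti)
        y.append(yi)
    return z, T, y


z, T, y = simulate()
effect_on_T = slope(z, T)  # first summary: association of z with T
effect_on_y = slope(z, y)  # second summary: association of z with y

print("effect of z on T:", round(effect_on_T, 2))
print("effect of z on y:", round(effect_on_y, 2))
print("ratio (the usual IV summary):", round(effect_on_y / effect_on_T, 2))
```

The first two printed numbers are the pair of summaries described above; the third is just their ratio, so any problem with either of the first two carries straight into it.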

Hal Varian adds

You have to assume that the only way that z affects y is through the treatment, T. So the IV model is
T = az + e
y = bT + d
It follows that
E(y|z) = b E(T|z) + E(d|z)
Now if we
1) assume E(d|z) = 0
2) verify that E(T|z) != 0
we can solve for b by division: b = E(y|z) / E(T|z). Of course, assumption 1 is untestable.
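
The division step can be sketched in Python on simulated data. This is only an illustration under stated assumptions: z is taken to be a 0/1 coin flip, the errors e and d share a hypothetical unobserved component u (so d is correlated with T but E(d|z) = 0 holds by construction), and the names simulate_varian, wald_estimate and naive_slope are made up for the example. The estimate uses differences in group means between z = 1 and z = 0, which reduces to E(y|z) / E(T|z) at z = 1 when the model has no intercepts, as written above.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def simulate_varian(n=20000, a=0.8, b=12.5, seed=1):
    """Simulate T = a*z + e, y = b*T + d with e and d correlated (hypothetical setup)."""
    rng = random.Random(seed)
    z, T, y = [], [], []
    for _ in range(n):
        zi = rng.randint(0, 1)      # instrument: a coin flip
        u = rng.gauss(0, 1)         # shared unobserved component (assumption)
        e = u + rng.gauss(0, 1)     # error in the T equation
        d = 5 * u + rng.gauss(0, 1)  # error in the y equation; E(d|z) = 0
        Ti = a * zi + e
        yi = b * Ti + d
        z.append(zi)
        T.append(Ti)
        y.append(yi)
    return z, T, y


def wald_estimate(z, T, y):
    """Solve for b by division: change in mean y over change in mean T across z."""
    T1 = [t for zi, t in zip(z, T) if zi == 1]
    T0 = [t for zi, t in zip(z, T) if zi == 0]
    y1 = [v for zi, v in zip(z, y) if zi == 1]
    y0 = [v for zi, v in zip(z, y) if zi == 0]
    return (mean(y1) - mean(y0)) / (mean(T1) - mean(T0))


def naive_slope(T, y):
    """Least-squares slope of y on T, ignoring z (biased when d correlates with T)."""
    mT, my = mean(T), mean(y)
    sTy = sum((t - mT) * (v - my) for t, v in zip(T, y))
    sTT = sum((t - mT) ** 2 for t in T)
    return sTy / sTT


z, T, y = simulate_varian()
b_iv = wald_estimate(z, T, y)
b_naive = naive_slope(T, y)
print("b by division:", round(b_iv, 2))
print("naive slope of y on T:", round(b_naive, 2))
```

In this setup the division recovers something close to the b used in the simulation, while the naive slope is pulled upward by the shared component u. The simulation can only show this because E(d|z) = 0 was built in; with real data that assumption remains untestable.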

An extreme case is a purely randomized experiment, where e=0 and z is a coin flip.