delicatessen.estimating_equations.causal.ee_iv_causal

ee_iv_causal(theta, y, A, Z, weights=None)

Estimating equations for instrumental variable (IV) analysis with the usual IV estimator, which is also referred to as the Wald estimator or grouping estimator in the literature. The parameter is the additive effect of action A on outcome Y, estimated by leveraging the instrument Z. The usual IV estimator is

\[\beta = \frac{E[Y \mid Z=1] - E[Y \mid Z=0]}{E[A \mid Z=1] - E[A \mid Z=0]}\]

One can build a set of estimating equations that consists of 5 parameters (\(\beta\) and four conditional means). To reduce the number of parameters being estimated, the above expression is instead rewritten, after some tedious algebra, as the following pair of estimating equations

\[\begin{split}\sum_{i=1}^n \begin{bmatrix} \left[ Y_i - \beta A_i \right] \times \left[ Z_i - \mu \right] \\ Z_i - \mu \end{bmatrix} = 0\end{split}\]

The parameter \(\beta\) corresponds to the causal effect, interpreted under either a homogeneity assumption or a monotonicity assumption. The second parameter, \(\mu\), is simply the mean of \(Z\).
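The algebra linking the pair of estimating equations to the ratio form can be sketched as follows. This is an outline for a binary instrument, where \(\mu = \Pr(Z=1)\), and is not taken from the references. Solving the expectation of the first estimating function for \(\beta\) gives a ratio of covariances, and each covariance with a binary \(Z\) is a difference in conditional means scaled by \(\mu(1-\mu)\):

\[\begin{split}\beta &= \frac{E[Y (Z - \mu)]}{E[A (Z - \mu)]} = \frac{Cov(Y, Z)}{Cov(A, Z)} \\ E[Y (Z - \mu)] &= \mu E[Y \mid Z=1] - \mu \left( \mu E[Y \mid Z=1] + (1 - \mu) E[Y \mid Z=0] \right) \\ &= \mu (1 - \mu) \left( E[Y \mid Z=1] - E[Y \mid Z=0] \right)\end{split}\]

The same steps applied to \(A\) give \(E[A (Z - \mu)] = \mu (1 - \mu) \left( E[A \mid Z=1] - E[A \mid Z=0] \right)\). The factor \(\mu (1 - \mu)\) cancels in the ratio, which recovers the usual IV estimator.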

Parameters

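theta (ndarray, list, vector) – Theta consists of 2 values: the first is \(\beta\) (the usual IV parameter) and the second is \(\mu\) (the mean of the instrument Z).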
y (ndarray, list, vector) – 1-dimensional vector of n observed values for the outcome of interest.
A (ndarray, list, vector) – 1-dimensional vector of n observed values for the action of interest.
Z (ndarray, list, vector) – 1-dimensional vector of n observed values for the instrumental variable. The Z values should all be 0 or 1.
weights (ndarray, list, vector, None, optional) – 1-dimensional vector of n weights. Default is None, which assigns a weight of 1 to all observations. This argument is intended to support the use of sampling or missingness weights.

Returns

Returns a 2-by-n NumPy array evaluated for the input theta and y, A, Z. The first row corresponds to \(\beta\) and the second row to \(\mu\).

Return type

array
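To make the shape of the returned array concrete, the following is a minimal NumPy sketch of the pair of estimating functions given above. The function name `iv_estimating_functions` is hypothetical and this is not the delicatessen source code. In particular, applying the weights by multiplying each observation's contributions is an assumption made for illustration.

```python
import numpy as np


def iv_estimating_functions(theta, y, A, Z, weights=None):
    """Illustrative sketch only (hypothetical helper, not delicatessen's code)."""
    beta, mu = theta
    y = np.asarray(y, dtype=float)
    A = np.asarray(A, dtype=float)
    Z = np.asarray(Z, dtype=float)
    # Assumption: weights scale each observation's contribution; default is 1
    w = np.ones_like(y) if weights is None else np.asarray(weights, dtype=float)

    ee_beta = (y - beta * A) * (Z - mu)   # first row: contributions for beta
    ee_mu = Z - mu                        # second row: contributions for mu
    return np.vstack([ee_beta, ee_mu]) * w


# Tiny hand-made data set, used only to show the output shape
y = np.array([1.0, 3.0, 2.0, 5.0])
A = np.array([0.0, 1.0, 0.0, 1.0])
Z = np.array([0.0, 0.0, 1.0, 1.0])
out = iv_estimating_functions([2.0, 0.5], y, A, Z)
print(out.shape)
```

Each column holds one observation's contribution, so the M-estimator looks for the theta at which every row sums to zero across the columns.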

Examples

Estimating equations with ee_iv_causal should be constructed similarly to the following

>>> import numpy as np
>>> import pandas as pd
>>> from delicatessen import MEstimator
>>> from delicatessen.estimating_equations import ee_iv_causal
>>> from delicatessen.utilities import inverse_logit

Some generic data

>>> n = 200
>>> d = pd.DataFrame()
>>> d['Z'] = np.random.binomial(n=1, p=0.5, size=n)
>>> d['U'] = np.random.normal(size=n)
>>> pr_a = inverse_logit(d['U'] + d['Z'])
>>> d['A'] = np.random.binomial(n=1, p=pr_a, size=n)
>>> d['Y'] = 2*d['A'] - d['U'] + np.random.normal(size=n)

The psi function for the usual IV can be called as

>>> def psi(theta):
...     return ee_iv_causal(theta,
...                         y=d['Y'],
...                         A=d['A'],
...                         Z=d['Z'])

Calling the M-estimator for estimation

>>> estr = MEstimator(psi, init=[0., 0.5, ])
>>> estr.estimate(solver='lm')

Inspecting the parameter estimates, variance, and 95% confidence intervals

>>> estr.theta
>>> estr.variance
>>> estr.confidence_intervals()

More specifically, the corresponding parameters are

>>> estr.theta[0]    # Usual IV
>>> estr.theta[1]    # Mean of the instrument Z
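As an informal check of the equivalence described above, the root of the pair of estimating equations can be written in closed form and compared against the ratio of differences in conditional means. The sketch below uses only NumPy and does not call delicatessen. The seed, the sample size, and the variable names are arbitrary choices for illustration, and the data-generating mechanism mirrors the generic data above.

```python
import numpy as np

rng = np.random.default_rng(2024)   # arbitrary seed, for reproducibility only
n = 5000

# Same data-generating mechanism as the generic data above
Z = rng.binomial(n=1, p=0.5, size=n)
U = rng.normal(size=n)
pr_a = 1 / (1 + np.exp(-(U + Z)))   # inverse logit of U + Z
A = rng.binomial(n=1, p=pr_a, size=n)
Y = 2*A - U + rng.normal(size=n)

# Usual IV (Wald) estimator: ratio of differences in conditional means
wald = ((Y[Z == 1].mean() - Y[Z == 0].mean())
        / (A[Z == 1].mean() - A[Z == 0].mean()))

# Closed-form root of the two estimating equations
mu_hat = Z.mean()
beta_hat = np.sum(Y * (Z - mu_hat)) / np.sum(A * (Z - mu_hat))

print(np.isclose(wald, beta_hat))
```

The two quantities agree up to floating-point error because the identity holds exactly in the sample, not only in expectation.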

References

Boos DD, & Stefanski LA. (2013). M-estimation (estimating equations). In Essential Statistical Inference (p. 307). Springer, New York, NY.

Wald A. (1940). The fitting of straight lines if both variables are subject to error. The Annals of Mathematical Statistics 11(3), 284-300.