delicatessen.estimating_equations.causal.ee_aipw

ee_aipw(theta, y, A, W, X, X1, X0, truncate=None, force_continuous=False)

Estimating equation for augmented inverse probability weighting (AIPW) estimator. AIPW consists of two nuisance models (the propensity score model and the outcome model).

The stacked estimating equations are

\[\begin{split}\sum_{i=1}^n \begin{bmatrix} (\theta_1 - \theta_2) - \theta_0 \\ \frac{A_i Y_i}{\pi_i} - \frac{\hat{Y^1}(A_i-\pi_i}{\pi_i} - \theta_1 \\ \frac{(1-A_i) Y_i}{1-\pi_i} + \frac{\hat{Y^0}(A_i-\pi_i}{1-\pi_i} - \theta_2 \\ \left\{ A_i - \text{expit}(W_i^T \alpha) \right\} W_i \\ \left\{ Y_i - g(X_i^T \beta) \right\} X_i \end{bmatrix} = 0\end{split}\]

where \(A\) is the action and \(W\) is the set of confounders, \(Y\) is the outcome, and \(\pi_i = \text{expit}(W_i^T \alpha)\). The first estimating equation is for the average causal effect, the second is for the mean under \(A:=1\), the third is for the mean under \(A:=0\), the fourth is the logistic regression model for the propensity scores, and the last is for the outcome model. Here, the length of the theta vector is 3+`b`+`c`, where b is the number of parameters in the propensity score model and c is the number of parameters in the outcome model.

By default, ee_aipw detects whether y is all binary (zero or one), and applies logistic regression. Notice that X here should consists of both A and W (with possible interaction terms or other differences in functional forms from the propensity score model).

Parameters

theta (ndarray, list, vector) – Theta consists of 3+`b`+`c` values.
y (ndarray, list, vector) – 1-dimensional vector of n observed values.
A (ndarray, list, vector) – 1-dimensional vector of n observed values. The A values should all be 0 or 1.
W (ndarray, list, vector) – 2-dimensional vector of n observed values for b variables to model the probability of A with.
X (ndarray, list, vector) – 2-dimensional vector of n observed values for c variables to model the outcome y.
X1 (ndarray, list, vector) – 2-dimensional vector of n observed values for c variables under the action plan where A=1 for all units.
X0 (ndarray, list, vector, None, optional) – 2-dimensional vector of n observed values for c variables under the action plan where A=0 for all units.
truncate (None, list, set, ndarray, optional) – Bounds to truncate the estimated probabilities of A at. For example, estimated probabilities above 0.99 or below 0.01 can be set to 0.99 or 0.01, respectively. This is done by specifying truncate=(0.01, 0.99). Note this step is done via numpy.clip(.., a_min, a_max), so order is important. Default is None, which applies to no truncation.
force_continuous (bool, optional) – Option to force the use of linear regression despite detection of a binary variable.

Returns

Returns a (3+`b`+`c`)-by-n NumPy array evaluated for the input theta

Return type

array

Examples

Construction of a estimating equation(s) with ee_aipw should be done similar to the following

>>> import numpy as np
>>> import pandas as pd
>>> from delicatessen import MEstimator
>>> from delicatessen.estimating_equations import ee_aipw

Some generic data

>>> n = 200
>>> d = pd.DataFrame()
>>> d['W'] = np.random.binomial(1, p=0.5, size=n)
>>> d['A'] = np.random.binomial(1, p=(0.25 + 0.5*d['W']), size=n)
>>> d['Ya0'] = np.random.binomial(1, p=(0.75 - 0.5*d['W']), size=n)
>>> d['Ya1'] = np.random.binomial(1, p=(0.75 - 0.5*d['W'] - 0.1*1), size=n)
>>> d['Y'] = (1-d['A'])*d['Ya0'] + d['A']*d['Ya1']
>>> d['C'] = 1

Defining psi, or the stacked estimating equations. Note that A is the action of interest. First, we will apply some necessary data processing. We will create an interaction term between A and W in the original data. Then we will generate a copy of the data and update the values of A=1, and then generate another copy but set A=0 in that copy.

>>> d['AW'] = d['A']*d['W']
>>> d1 = d.copy()   # Copy where all A=1
>>> d1['A'] = 1
>>> d1['AW'] = d1['A']*d1['W']
>>> d0 = d.copy()   # Copy where all A=0
>>> d0['A'] = 0
>>> d0['AW'] = d0['A']*d0['W']

Having setup our data, we can now define the psi function.

>>> def psi(theta):
>>>     return ee_aipw(theta,
>>>                    y=d['Y'],
>>>                    A=d['A'],
>>>                    W=d[['C', 'W']],
>>>                    X=d[['C', 'A', 'W', 'AW']],
>>>                    X1=d1[['C', 'A', 'W', 'AW']],
>>>                    X0=d0[['C', 'A', 'W', 'AW']])

Calling the M-estimator. AIPW has 3 parameters with 2 coefficients in the propensity score model, and 4 coefficients in the outcome model, the total number of initial values should be 3+2+4=9. When Y is binary, it will be best to start with [0., 0.5, 0.5, ...] followed by all 0. for the initial values. Otherwise, starting with all 0. as initials is reasonable.

>>> estr = MEstimator(psi,
>>>                   init=[0., 0.5, 0.5, 0., 0., 0., 0., 0., 0.])
>>> estr.estimate(solver='lm')

Inspecting the parameter estimates, variance, and 95% confidence intervals

>>> estr.theta
>>> estr.variance
>>> estr.confidence_intervals()

More specifically, the corresponding parameters are

>>> estr.theta[0]     # causal mean difference of 1 versus 0
>>> estr.theta[1]     # causal mean under A=1
>>> estr.theta[2]     # causal mean under A=0
>>> estr.theta[3:5]   # propensity score regression coefficients
>>> estr.theta[5:]    # outcome regression coefficients

References

Hernán MA, & Robins JM. (2006). Estimating causal effects from epidemiological data. Journal of Epidemiology & Community Health, 60(7), 578-586.

Funk MJ, Westreich D, Wiesen C, Stürmer T, Brookhart MA, & Davidian M. (2011). Doubly robust estimation of causal effects. American Journal of Epidemiology, 173(7), 761-767.

Tsiatis AA. (2006). Semiparametric theory and missing data. Springer, New York, NY.