delicatessen.estimating_equations.causal.ee_aipw
- ee_aipw(theta, y, A, W, X, X1, X0, truncate=None, force_continuous=False)
Estimating equation for augmented inverse probability weighting (AIPW) estimator. AIPW consists of two nuisance models (the propensity score model and the outcome model).
The stacked estimating equations are
\[\begin{split}\sum_{i=1}^n \begin{bmatrix} (\theta_1 - \theta_2) - \theta_0 \\ \frac{A_i Y_i}{\pi_i} - \frac{\hat{Y^1} ( A_i - \pi_i ) }{\pi_i} - \theta_1 \\ \frac{(1 - A_i) Y_i}{1 - \pi_i} + \frac{\hat{Y^0} ( A_i - \pi_i ) }{1 - \pi_i} - \theta_2 \\ \left\{ A_i - \text{expit}(W_i^T \alpha) \right\} W_i \\ \left\{ Y_i - g(X_i^T \beta) \right\} X_i \end{bmatrix} = 0\end{split}\]where \(A\) is the action and \(W\) is the set of confounders, \(Y\) is the outcome, and \(\pi_i = \text{expit}(W_i^T \alpha)\). The first estimating equation is for the average causal effect, the second is for the mean under \(A:=1\), the third is for the mean under \(A:=0\), the fourth is the logistic regression model for the propensity scores, and the last is for the outcome model. Here, the length of the theta vector is 3+`b`+`c`, where b is the number of parameters in the propensity score model and c is the number of parameters in the outcome model.
By default, ee_aipw detects whether y is all binary (zero or one), and applies logistic regression. Notice that
Xhere should consists of bothAandW(with possible interaction terms or other differences in functional forms from the propensity score model).- Parameters
theta (ndarray, list, vector) – Theta consists of 3+`b`+`c` values.
y (ndarray, list, vector) – 1-dimensional vector of n observed values.
A (ndarray, list, vector) – 1-dimensional vector of n observed values. The A values should all be 0 or 1.
W (ndarray, list, vector) – 2-dimensional vector of n observed values for b variables to model the probability of
Awith.X (ndarray, list, vector) – 2-dimensional vector of n observed values for c variables to model the outcome
y.X1 (ndarray, list, vector) – 2-dimensional vector of n observed values for c variables under the action plan where
A=1for all units.X0 (ndarray, list, vector, None, optional) – 2-dimensional vector of n observed values for c variables under the action plan where
A=0for all units.truncate (None, list, set, ndarray, optional) – Bounds to truncate the estimated probabilities of
Aat. For example, estimated probabilities above 0.99 or below 0.01 can be set to 0.99 or 0.01, respectively. This is done by specifyingtruncate=(0.01, 0.99). Note this step is done vianumpy.clip(.., a_min, a_max), so order is important. Default isNone, which applies to no truncation.force_continuous (bool, optional) – Option to force the use of linear regression despite detection of a binary variable.
- Returns
Returns a (3+`b`+`c`)-by-n NumPy array evaluated for the input
theta- Return type
array
Examples
Construction of an estimating equation(s) with
ee_aipwshould be done similar to the following>>> import numpy as np >>> import pandas as pd >>> from delicatessen import MEstimator >>> from delicatessen.estimating_equations import ee_aipw
Some generic data
>>> n = 200 >>> d = pd.DataFrame() >>> d['W'] = np.random.binomial(1, p=0.5, size=n) >>> d['A'] = np.random.binomial(1, p=(0.25 + 0.5*d['W']), size=n) >>> d['Ya0'] = np.random.binomial(1, p=(0.75 - 0.5*d['W']), size=n) >>> d['Ya1'] = np.random.binomial(1, p=(0.75 - 0.5*d['W'] - 0.1*1), size=n) >>> d['Y'] = (1-d['A'])*d['Ya0'] + d['A']*d['Ya1'] >>> d['C'] = 1
Defining psi, or the stacked estimating equations. Note that
Ais the action of interest. First, we will apply some necessary data processing. We will create an interaction term betweenAandWin the original data. Then we will generate a copy of the data and update the values ofA=1, and then generate another copy but setA=0in that copy.>>> d['AW'] = d['A']*d['W'] >>> d1 = d.copy() # Copy where all A=1 >>> d1['A'] = 1 >>> d1['AW'] = d1['A']*d1['W'] >>> d0 = d.copy() # Copy where all A=0 >>> d0['A'] = 0 >>> d0['AW'] = d0['A']*d0['W']
Having setup our data, we can now define the psi function.
>>> def psi(theta): >>> return ee_aipw(theta, >>> y=d['Y'], >>> A=d['A'], >>> W=d[['C', 'W']], >>> X=d[['C', 'A', 'W', 'AW']], >>> X1=d1[['C', 'A', 'W', 'AW']], >>> X0=d0[['C', 'A', 'W', 'AW']])
Calling the M-estimator. AIPW has 3 parameters with 2 coefficients in the propensity score model, and 4 coefficients in the outcome model, the total number of initial values should be 3+2+4=9. When Y is binary, it will be best to start with
[0., 0.5, 0.5, ...]followed by all0.for the initial values. Otherwise, starting with all 0. as initials is reasonable.>>> estr = MEstimator(psi, >>> init=[0., 0.5, 0.5, 0., 0., 0., 0., 0., 0.]) >>> estr.estimate(solver='lm')
Inspecting the parameter estimates, variance, and 95% confidence intervals
>>> estr.theta >>> estr.variance >>> estr.confidence_intervals()
More specifically, the corresponding parameters are
>>> estr.theta[0] # causal mean difference of 1 versus 0 >>> estr.theta[1] # causal mean under A=1 >>> estr.theta[2] # causal mean under A=0 >>> estr.theta[3:5] # propensity score regression coefficients >>> estr.theta[5:] # outcome regression coefficients
References
Hernán MA, & Robins JM. (2006). Estimating causal effects from epidemiological data. Journal of Epidemiology & Community Health, 60(7), 578-586.
Funk MJ, Westreich D, Wiesen C, Stürmer T, Brookhart MA, & Davidian M. (2011). Doubly robust estimation of causal effects. American Journal of Epidemiology, 173(7), 761-767.
Tsiatis AA. (2006). Semiparametric theory and missing data. Springer, New York, NY.