delicatessen.estimating_equations.causal.ee_aipw
- ee_aipw(theta, y, A, W, X, X1, X0, truncate=None, force_continuous=False)
Estimating equation for augmented inverse probability weighting (AIPW) estimator. AIPW consists of two nuisance models (the propensity score model and the outcome model).
The stacked estimating equations are
\[\begin{split}\sum_{i=1}^n \begin{bmatrix} (\theta_1 - \theta_2) - \theta_0 \\ \frac{A_i Y_i}{\pi_i} - \frac{\hat{Y^1_i}(A_i-\pi_i)}{\pi_i} - \theta_1 \\ \frac{(1-A_i) Y_i}{1-\pi_i} + \frac{\hat{Y^0_i}(A_i-\pi_i)}{1-\pi_i} - \theta_2 \\ \left\{ A_i - \text{expit}(W_i^T \alpha) \right\} W_i \\ \left\{ Y_i - g(X_i^T \beta) \right\} X_i \end{bmatrix} = 0\end{split}\]where \(A\) is the action, \(W\) is the set of confounders, \(Y\) is the outcome, \(\pi_i = \text{expit}(W_i^T \alpha)\), and \(\hat{Y^1_i}, \hat{Y^0_i}\) are the outcome model predictions with \(A_i\) set to 1 and 0, respectively. The first estimating equation is for the average causal effect, the second is for the mean under \(A:=1\), the third is for the mean under \(A:=0\), the fourth is the logistic regression model for the propensity scores, and the last is for the outcome model. Here, the length of the theta vector is 3+`b`+`c`, where `b` is the number of parameters in the propensity score model and `c` is the number of parameters in the outcome model.
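As a sketch of how the first three estimating equations combine the two nuisance models, the per-observation AIPW contributions can be written out directly. The arrays below are hypothetical placeholders standing in for fitted nuisance-model predictions, not output from delicatessen:

```python
import numpy as np

# Hypothetical observed data and nuisance predictions (placeholders)
y = np.array([1., 0., 1., 1.])          # outcomes Y_i
a = np.array([1, 0, 1, 0])              # actions A_i
pi = np.array([0.6, 0.4, 0.7, 0.5])     # propensity scores, expit(W_i^T alpha)
yhat1 = np.array([0.8, 0.6, 0.9, 0.7])  # outcome predictions with A set to 1
yhat0 = np.array([0.5, 0.3, 0.6, 0.4])  # outcome predictions with A set to 0

# Per-observation contributions for the means under A:=1 and A:=0,
# matching the second and third rows of the stacked estimating equations
mu1_i = a*y/pi - yhat1*(a - pi)/pi
mu0_i = (1 - a)*y/(1 - pi) + yhat0*(a - pi)/(1 - pi)

# Setting each sum to zero and solving gives the means and their difference
theta1 = mu1_i.mean()       # mean under A:=1
theta2 = mu0_i.mean()       # mean under A:=0
theta0 = theta1 - theta2    # average causal effect
```
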
By default, ee_aipw detects whether y is all binary (zero or one), and applies logistic regression. Notice that X here should consist of both A and W (with possible interaction terms or other differences in functional form from the propensity score model).
- Parameters
theta (ndarray, list, vector) – Theta consists of 3+`b`+`c` values.
y (ndarray, list, vector) – 1-dimensional vector of n observed values.
A (ndarray, list, vector) – 1-dimensional vector of n observed values. The A values should all be 0 or 1.
W (ndarray, list, vector) – 2-dimensional vector of n observed values for b variables to model the probability of A with.
X (ndarray, list, vector) – 2-dimensional vector of n observed values for c variables to model the outcome y.
X1 (ndarray, list, vector) – 2-dimensional vector of n observed values for c variables under the action plan where A=1 for all units.
X0 (ndarray, list, vector, None, optional) – 2-dimensional vector of n observed values for c variables under the action plan where A=0 for all units.
truncate (None, list, set, ndarray, optional) – Bounds to truncate the estimated probabilities of A at. For example, estimated probabilities above 0.99 or below 0.01 can be set to 0.99 or 0.01, respectively. This is done by specifying truncate=(0.01, 0.99). Note this step is done via numpy.clip(..., a_min, a_max), so order is important. Default is None, which applies no truncation.
force_continuous (bool, optional) – Option to force the use of linear regression despite detection of a binary variable.
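As an illustration of the truncate option's clipping behavior (the propensity scores below are hypothetical placeholders, not estimated from data):

```python
import numpy as np

# Hypothetical estimated propensity scores (placeholders, not model output)
pi = np.array([0.005, 0.20, 0.995, 0.50])

# truncate=(0.01, 0.99) clips values outside the bounds; note the
# (a_min, a_max) ordering matters
pi_trunc = np.clip(pi, a_min=0.01, a_max=0.99)
# pi_trunc: [0.01, 0.20, 0.99, 0.50]
```
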
- Returns
Returns a (3+`b`+`c`)-by-n NumPy array evaluated for the input theta.
- Return type
array
Examples
Construction of the estimating equations with ee_aipw should be done similar to the following

>>> import numpy as np
>>> import pandas as pd
>>> from delicatessen import MEstimator
>>> from delicatessen.estimating_equations import ee_aipw
Some generic data
>>> n = 200
>>> d = pd.DataFrame()
>>> d['W'] = np.random.binomial(1, p=0.5, size=n)
>>> d['A'] = np.random.binomial(1, p=(0.25 + 0.5*d['W']), size=n)
>>> d['Ya0'] = np.random.binomial(1, p=(0.75 - 0.5*d['W']), size=n)
>>> d['Ya1'] = np.random.binomial(1, p=(0.75 - 0.5*d['W'] - 0.1*1), size=n)
>>> d['Y'] = (1-d['A'])*d['Ya0'] + d['A']*d['Ya1']
>>> d['C'] = 1
Defining psi, or the stacked estimating equations. Note that A is the action of interest. First, we will apply some necessary data processing. We will create an interaction term between A and W in the original data. Then we will generate a copy of the data and update the values of A=1, and then generate another copy but set A=0 in that copy.

>>> d['AW'] = d['A']*d['W']
>>> d1 = d.copy()   # Copy where all A=1
>>> d1['A'] = 1
>>> d1['AW'] = d1['A']*d1['W']
>>> d0 = d.copy()   # Copy where all A=0
>>> d0['A'] = 0
>>> d0['AW'] = d0['A']*d0['W']
Having setup our data, we can now define the psi function.
>>> def psi(theta):
>>>     return ee_aipw(theta,
>>>                    y=d['Y'],
>>>                    A=d['A'],
>>>                    W=d[['C', 'W']],
>>>                    X=d[['C', 'A', 'W', 'AW']],
>>>                    X1=d1[['C', 'A', 'W', 'AW']],
>>>                    X0=d0[['C', 'A', 'W', 'AW']])
Calling the M-estimator. AIPW has 3 parameters, plus 2 coefficients in the propensity score model and 4 coefficients in the outcome model, so the total number of initial values is 3+2+4=9. When Y is binary, it is best to start with [0., 0.5, 0.5, ...] followed by all 0. for the initial values. Otherwise, starting with all 0. as initial values is reasonable.

>>> estr = MEstimator(psi,
>>>                   init=[0., 0.5, 0.5, 0., 0., 0., 0., 0., 0.])
>>> estr.estimate(solver='lm')
Inspecting the parameter estimates, variance, and 95% confidence intervals
>>> estr.theta
>>> estr.variance
>>> estr.confidence_intervals()
More specifically, the corresponding parameters are
>>> estr.theta[0]    # causal mean difference of 1 versus 0
>>> estr.theta[1]    # causal mean under A=1
>>> estr.theta[2]    # causal mean under A=0
>>> estr.theta[3:5]  # propensity score regression coefficients
>>> estr.theta[5:]   # outcome regression coefficients
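The reported 95% confidence intervals are Wald-type intervals built from the sandwich variance. A minimal sketch of that construction with hypothetical numbers (ace and var_ace are placeholders, not results from the example above):

```python
import numpy as np

# Hypothetical point estimate and sandwich variance for theta[0], the average
# causal effect (in practice: estr.theta[0] and estr.variance[0, 0])
ace = -0.10
var_ace = 0.004

# Wald-type 95% confidence interval: estimate +/- 1.96 * standard error
se = np.sqrt(var_ace)
ci = (ace - 1.96 * se, ace + 1.96 * se)
```
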
References
Hernán MA, & Robins JM. (2006). Estimating causal effects from epidemiological data. Journal of Epidemiology & Community Health, 60(7), 578-586.
Funk MJ, Westreich D, Wiesen C, Stürmer T, Brookhart MA, & Davidian M. (2011). Doubly robust estimation of causal effects. American Journal of Epidemiology, 173(7), 761-767.
Tsiatis AA. (2006). Semiparametric theory and missing data. Springer, New York, NY.