delicatessen.estimating_equations.causal.ee_ipw_msm
- ee_ipw_msm(theta, y, A, W, V, distribution, link, hyperparameter=None, truncate=None, weights=None)
Estimating equation for parameters of a marginal structural model estimated using inverse probability weighting. For estimation of the propensity scores, a logistic model is used.
The stacked estimating equations are
\[\begin{split}\sum_{i=1}^n \begin{bmatrix} \frac{1}{\pi_i} \left\{ Y_i - g^{-1}(X_i^T \beta) \right\} \times \frac{D(\beta)}{v(\beta)} X_i \\ \left\{ A_i - \text{expit}(W_i^T \alpha) \right\} W_i \end{bmatrix} = 0\end{split}\]where \(A\) is the action, math:W is the set of confounders, and \(\pi_i = \text{expit}(W_i^T \alpha)\). Here, \(X\) is the design matrix for the marginal structural model (it includes \(A\), and possibly some covariates from \(W\)). The first estimating equation is a weighted generalized linear model is used. See
ee_glm
for details on this estimating equation. The second estimating equation is the logistic model for the propensity scores.Here,
theta
corresponds to multiple quantities. The first set of values correspond to the parameters of the marginal structural model, and the second set correspond to the logistic regression model coefficients for the propensity scores.- Parameters
theta (ndarray, list, vector) – Theta consists of c`+`b values.
y (ndarray, list, vector) – 1-dimensional vector of n observed values.
A (ndarray, list, vector) – 1-dimensional vector of n observed values. The A values should all be 0 or 1.
W (ndarray, list, vector) – 2-dimensional vector of n observed values for b variables to model the probability of
A
with.V (ndarray, list, vector) – 2-dimensional vector of n observed values for c variables in the marginal structural model.
distribution (str) – Distribution for the generalized linear model. See
ee_glm
for options.link (str) – Link function for the generalized linear model. See
ee_glm
for options.truncate (None, list, set, ndarray, optional) – Bounds to truncate the estimated probabilities of
A
at. For example, estimated probabilities above 0.99 or below 0.01 can be set to 0.99 or 0.01, respectively. This is done by specifyingtruncate=(0.01, 0.99)
. Note this step is done vianumpy.clip(.., a_min, a_max)
, so order is important. Default isNone
, which applies no truncation.weights (ndarray, list, vector, None, optional) – 1-dimensional vector of n weights. Default is None, which assigns a weight of 1 to all observations. This argument is intended to support the use of missingness weights. The propensity score model is not fit using these weights.
- Returns
Returns a (c`+`b)-by-n NumPy array evaluated for the input
theta
.- Return type
array
Examples
Construction of a estimating equation(s) with
ee_ipw_msm
should be done similar to the following>>> import numpy as np >>> import pandas as pd >>> from delicatessen import MEstimator >>> from delicatessen.estimating_equations import ee_ipw_msm
Some generic data
>>> n = 200 >>> d = pd.DataFrame() >>> d['W'] = np.random.binomial(1, p=0.5, size=n) >>> d['A'] = np.random.binomial(1, p=(0.25 + 0.5*d['W']), size=n) >>> d['Ya0'] = np.random.binomial(1, p=(0.75 - 0.5*d['W']), size=n) >>> d['Ya1'] = np.random.binomial(1, p=(0.75 - 0.5*d['W'] - 0.1*1), size=n) >>> d['Y'] = (1-d['A'])*d['Ya0'] + d['A']*d['Ya1'] >>> d['C'] = 1
Defining psi, or the stacked estimating equations for a logistic marginal structural model. Note that ‘A’ is the action.
>>> def psi(theta): >>> return ee_ipw_msm(theta, y=d['Y'], A=d['A'], >>> W=d[['C', 'W']], V=d[['C', 'A']], >>> link='logit', distribution='binomial')
Calling the M-estimation procedure. Since
W
is 2-by-n here and the marginal structural model (V
) consists of 2 parameters, the initial values should be of length 2+2=4. Starting values for the marginal structural model may need to be adjusted to lie within the domain of the chosen link-distribution functions.>>> estr = MEstimator(stacked_equations=psi, init=[0., 0., 0., 0.]) >>> estr.estimate(solver='lm')
Inspecting the parameter estimates, variance, and 95% confidence intervals
>>> estr.theta >>> estr.variance >>> estr.confidence_intervals()
More specifically, the corresponding parameters are
>>> estr.theta[0:2] # Marginal structural model paramters >>> estr.theta[2:] # Nuisance model parameters
If you want to see how truncating the probabilities works, try repeating the above code but specifying
truncate=(0.1, 0.9)
as an optional argument inee_ipw_msm
.References
Hernán MA, & Robins JM. (2006). Estimating causal effects from epidemiological data. Journal of Epidemiology & Community Health, 60(7), 578-586.
Cole SR, & Hernán MA. (2008). Constructing inverse probability weights for marginal structural models. American Journal of Epidemiology, 168(6), 656-664.