delicatessen.estimating_equations.causal.ee_ipw_msm

ee_ipw_msm(theta, y, A, W, V, distribution, link, hyperparameter=None, truncate=None, weights=None)

Estimating equation for parameters of a marginal structural model estimated using inverse probability weighting. For estimation of the propensity scores, a logistic model is used.

The stacked estimating equations are

\[\begin{split}\sum_{i=1}^n \begin{bmatrix} \frac{1}{\pi_i} \left\{ Y_i - g^{-1}(X_i^T \beta) \right\} \times \frac{D(\beta)}{v(\beta)} X_i \\ \left\{ A_i - \text{expit}(W_i^T \alpha) \right\} W_i \end{bmatrix} = 0\end{split}\]

where \(A\) is the action, math:W is the set of confounders, and \(\pi_i = \text{expit}(W_i^T \alpha)\). Here, \(X\) is the design matrix for the marginal structural model (it includes \(A\), and possibly some covariates from \(W\)). The first estimating equation is a weighted generalized linear model is used. See ee_glm for details on this estimating equation. The second estimating equation is the logistic model for the propensity scores.

Here, theta corresponds to multiple quantities. The first set of values correspond to the parameters of the marginal structural model, and the second set correspond to the logistic regression model coefficients for the propensity scores.

Parameters
  • theta (ndarray, list, vector) – Theta consists of c`+`b values.

  • y (ndarray, list, vector) – 1-dimensional vector of n observed values.

  • A (ndarray, list, vector) – 1-dimensional vector of n observed values. The A values should all be 0 or 1.

  • W (ndarray, list, vector) – 2-dimensional vector of n observed values for b variables to model the probability of A with.

  • V (ndarray, list, vector) – 2-dimensional vector of n observed values for c variables in the marginal structural model.

  • distribution (str) – Distribution for the generalized linear model. See ee_glm for options.

  • link (str) – Link function for the generalized linear model. See ee_glm for options.

  • truncate (None, list, set, ndarray, optional) – Bounds to truncate the estimated probabilities of A at. For example, estimated probabilities above 0.99 or below 0.01 can be set to 0.99 or 0.01, respectively. This is done by specifying truncate=(0.01, 0.99). Note this step is done via numpy.clip(.., a_min, a_max), so order is important. Default is None, which applies no truncation.

  • weights (ndarray, list, vector, None, optional) – 1-dimensional vector of n weights. Default is None, which assigns a weight of 1 to all observations. This argument is intended to support the use of missingness weights. The propensity score model is not fit using these weights.

Returns

Returns a (c`+`b)-by-n NumPy array evaluated for the input theta.

Return type

array

Examples

Construction of a estimating equation(s) with ee_ipw_msm should be done similar to the following

>>> import numpy as np
>>> import pandas as pd
>>> from delicatessen import MEstimator
>>> from delicatessen.estimating_equations import ee_ipw_msm

Some generic data

>>> n = 200
>>> d = pd.DataFrame()
>>> d['W'] = np.random.binomial(1, p=0.5, size=n)
>>> d['A'] = np.random.binomial(1, p=(0.25 + 0.5*d['W']), size=n)
>>> d['Ya0'] = np.random.binomial(1, p=(0.75 - 0.5*d['W']), size=n)
>>> d['Ya1'] = np.random.binomial(1, p=(0.75 - 0.5*d['W'] - 0.1*1), size=n)
>>> d['Y'] = (1-d['A'])*d['Ya0'] + d['A']*d['Ya1']
>>> d['C'] = 1

Defining psi, or the stacked estimating equations for a logistic marginal structural model. Note that ‘A’ is the action.

>>> def psi(theta):
>>>     return ee_ipw_msm(theta, y=d['Y'], A=d['A'],
>>>                       W=d[['C', 'W']], V=d[['C', 'A']],
>>>                       link='logit', distribution='binomial')

Calling the M-estimation procedure. Since W is 2-by-n here and the marginal structural model (V) consists of 2 parameters, the initial values should be of length 2+2=4. Starting values for the marginal structural model may need to be adjusted to lie within the domain of the chosen link-distribution functions.

>>> estr = MEstimator(stacked_equations=psi, init=[0., 0., 0., 0.])
>>> estr.estimate(solver='lm')

Inspecting the parameter estimates, variance, and 95% confidence intervals

>>> estr.theta
>>> estr.variance
>>> estr.confidence_intervals()

More specifically, the corresponding parameters are

>>> estr.theta[0:2]  # Marginal structural model paramters
>>> estr.theta[2:]   # Nuisance model parameters

If you want to see how truncating the probabilities works, try repeating the above code but specifying truncate=(0.1, 0.9) as an optional argument in ee_ipw_msm.

References

Hernán MA, & Robins JM. (2006). Estimating causal effects from epidemiological data. Journal of Epidemiology & Community Health, 60(7), 578-586.

Cole SR, & Hernán MA. (2008). Constructing inverse probability weights for marginal structural models. American Journal of Epidemiology, 168(6), 656-664.