delicatessen.estimating_equations.survival.ee_aft_weibull
- ee_aft_weibull(theta, X, t, delta, weights=None)
Estimating equation for accelerated failure time (AFT) model with a Weibull distribution. Let \(T_i\) indicate the time of the event and \(C_i\) indicate the time to right censoring. Therefore, the observable data consists of \(t_i = min(T_i, C_i)\) and \(\Delta_i = I(t_i = T_i)\). The estimating equations are
\[\begin{split}\sum_{i=1}^n = \begin{bmatrix} \frac{\Delta_i}{\lambda} - t_i^{\gamma} \exp(\beta X_i) \\ \Delta_i X_i - (\lambda t_i^{\gamma} \exp(\beta X_i))X_i \\ \frac{\Delta_i}{\gamma} + \Delta_i \log(t) - \lambda t_i^{\gamma} \exp(\beta X_i) \log(t) \end{bmatrix} = 0\end{split}\]The AFT consists of the following parameters: \(\mu, \beta, \sigma\). The above estimating equations use the proportional hazards form of the Weibull model. For the Weibull AFT, notice the following relation between the coefficients: \(\lambda = - \mu \gamma\), \(\beta_{PH} = - \beta_{AFT} \gamma\), and \(\gamma = \exp(\sigma)\).
Here, \(\theta\) is a 1-by-(2+`b`) array, where b is the distinct covariates included as part of
X
. For example, ifX
is a 3-by-n matrix, then theta will be a 1-by-5 array. The code is general to allow for an arbitrary dimension ofX
.- Parameters
theta (ndarray, list, vector) – theta consists of 1+`b`+1 values. Therefore, initial values should consist of the same number as the number of columns present in
X
plus 2. This can easily be implemented via[0, ] + [0, ] * X.shape[1] + [0, ]
.X (ndarray, list, vector) – 2-dimensional vector of n observed values for b variables.
t (ndarray, list, vector) – 1-dimensional vector of n observed times.
delta (ndarray, list, vector) – 1-dimensional vector of n values indicating whether the time was an event or censoring.
weights (ndarray, list, vector, None, optional) – 1-dimensional vector of n weights. Default is
None
, which assigns a weight of 1 to all observations.
- Returns
Returns a 1+`b`+1-by-n NumPy array evaluated for the input
theta
.- Return type
array
Examples
Construction of a estimating equation(s) with
ee_aft_weibull
should be done similar to the following>>> import numpy as np >>> import pandas as pd >>> from delicatessen import MEstimator >>> from delicatessen.estimating_equations import ee_aft_weibull
Some generic survival data to estimate a Weibull AFT regresion model
>>> n = 100 >>> data = pd.DataFrame() >>> data['X'] = np.random.binomial(n=1, p=0.5, size=n) >>> data['W'] = np.random.binomial(n=1, p=0.5, size=n) >>> data['T'] = (1/1.25 + 1/np.exp(0.5)*data['X'])*np.random.weibull(a=0.75, size=n) >>> data['C'] = np.random.weibull(a=1, size=n) >>> data['C'] = np.where(data['C'] > 10, 10, data['C']) >>> data['delta'] = np.where(data['T'] < data['C'], 1, 0) >>> data['t'] = np.where(data['delta'] == 1, data['T'], data['C']) >>> d_obs = data[['X', 'W', 't', 'delta']].copy()
Defining psi, or the stacked estimating equations
>>> def psi(theta): >>> return ee_aft_weibull(theta=theta, X=d_obs[['X', 'W']], >>> t=d_obs['t'], delta=d_obs['delta'])
Calling the M-estimator (note that init has 4 values now, since
X.shape[1]
is 2).>>> estr = MEstimator(stacked_equations=psi, init=[0., 0., 0., 0.]) >>> estr.estimate(solver='hybr')
Inspecting the parameter estimates, variance, and confidence intervals
>>> estr.theta >>> estr.variance >>> estr.confidence_intervals()
Inspecting parameter the specific parameter estimates
>>> estr.theta[0] # log(mu) (scale) >>> estr.theta[1:-1] # log(beta) (scale coefficients) >>> estr.theta[-1] # log(sigma) (shape)
References
Collett D. (2015). Parametric proportional hazards models In: Modelling survival data in medical research. CRC press. pg171-220
Collett D. (2015). Accelerated failure time and other parametric models. In: Modelling survival data in medical research. CRC press. pg171-220