delicatessen.estimating_equations.survival.ee_survival_model

ee_survival_model(theta, t, delta, distribution)

Estimating equations for parametric survival models. Let \(T_i\) indicate the time of the event and \(C_i\) indicate the time of right censoring. The observable data then consist of \(t_i = \min(T_i, C_i)\) and \(\Delta_i = I(t_i = T_i)\). The general estimating equations are

\[\sum_{i=1}^n \begin{bmatrix} \Delta_i \frac{f'(t_i; \theta)}{f(t_i; \theta)} + (1-\Delta_i) \frac{S'(t_i; \theta)}{S(t_i; \theta)} \end{bmatrix} = 0\]

Here, \(\theta\) consists of the parameters for the corresponding model, \(f\) and \(S\) are the density and survival functions, and the primes denote derivatives with respect to \(\theta\). Note that this estimating equation implicitly assumes that the event and censoring times are independent. See the table below for the different survival models and their parametrization in terms of the hazard function, \(h(t)\).

| Distribution | Keyword | Parameters | \(h(t)\) |
|---|---|---|---|
| Exponential | exponential | \(\lambda\) | \(\lambda\) |
| Weibull | weibull | \(\lambda, \gamma\) | \(\lambda \gamma t^{\gamma - 1}\) |
| Gompertz | gompertz | \(\lambda, \gamma\) | \(\lambda \exp(\gamma t)\) |
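
As a worked instance of the general form, the Weibull hazard in the table implies \(S(t) = \exp(-\lambda t^{\gamma})\) and \(f(t) = h(t) S(t)\), so the per-observation contributions reduce to \(\Delta_i / \lambda - t_i^{\gamma}\) for \(\lambda\) and \(\Delta_i (1/\gamma + \log t_i) - \lambda t_i^{\gamma} \log t_i\) for \(\gamma\). The following is a minimal sketch of those two rows in NumPy. The helper name weibull_score and the small data set are hypothetical, written only for illustration, and this is not necessarily how delicatessen evaluates the equations internally.

```python
import numpy as np

def weibull_score(theta, t, delta):
    """Per-observation Weibull score contributions (hypothetical helper).

    Assumes h(t) = lam * gam * t**(gam - 1), so S(t) = exp(-lam * t**gam).
    Returns a 2-by-n array: row 0 is for lam, row 1 is for gam.
    """
    lam, gam = theta
    t = np.asarray(t, dtype=float)
    delta = np.asarray(delta, dtype=float)
    t_gam = t ** gam
    score_lam = delta / lam - t_gam
    score_gam = delta * (1.0 / gam + np.log(t)) - lam * t_gam * np.log(t)
    return np.vstack([score_lam, score_gam])

# Small made-up data set: observed times and event indicators
t_obs = np.array([0.5, 1.2, 0.3, 2.0, 0.9])
d_obs = np.array([1, 0, 1, 1, 0])

ee = weibull_score([1.0, 1.0], t_obs, d_obs)
print(ee.shape)  # one row per parameter, one column per observation
```

Summing each row over the observations gives the left-hand side of the estimating equations at the supplied theta.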

Parameters
  • theta (ndarray, list, vector) – Parameter values. In the case of the Weibull model, theta consists of two values, both of which are non-negative. Therefore, a positive initial value like [1, ] is recommended (the example below uses init=[1., 1.]).

  • t (ndarray, list, vector) – 1-dimensional vector of n observed times. No missing data should be included (missing data may cause unexpected behavior).

  • delta (ndarray, list, vector) – 1-dimensional vector of n event indicators, where 1 indicates an event and 0 indicates right censoring. No missing data should be included (missing data may cause unexpected behavior).

  • distribution (str) – Distribution for the parametric survival model, given as one of the keywords in the table above.
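
Because missing values in t or delta may cause unexpected behavior, it can help to validate the inputs before building the estimating equations. The following is a minimal sketch of such a check; check_survival_inputs is a hypothetical helper and is not part of delicatessen.

```python
import numpy as np

def check_survival_inputs(t, delta):
    """Validate observed times and event indicators (hypothetical helper)."""
    t = np.asarray(t, dtype=float)
    delta = np.asarray(delta, dtype=float)
    if t.ndim != 1 or delta.ndim != 1:
        raise ValueError("t and delta must be 1-dimensional")
    if t.shape[0] != delta.shape[0]:
        raise ValueError("t and delta must have the same length")
    if np.isnan(t).any() or np.isnan(delta).any():
        raise ValueError("t and delta must not contain missing values")
    if not np.isin(delta, [0, 1]).all():
        raise ValueError("delta must contain only 0 (censored) and 1 (event)")
    return t, delta

# Passes silently for complete data; raises ValueError otherwise
t_clean, d_clean = check_survival_inputs([0.5, 1.2, 0.3], [1, 0, 1])
```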

Returns

Returns a p-by-n NumPy array evaluated for the input theta, where p is the number of parameters in the model.

Return type

array
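
To make the return value concrete: the estimating equations are solved at the theta where each row of the p-by-n array sums to zero over the n observations. For the exponential model the single row is \(\Delta_i / \lambda - t_i\), whose root has the closed form \(\hat{\lambda} = \sum_i \Delta_i / \sum_i t_i\). The sketch below checks this with a hand-written exponential_score, which is a hypothetical stand-in for illustration and not the library function; the data are made up.

```python
import numpy as np

def exponential_score(theta, t, delta):
    """1-by-n exponential score contributions, assuming h(t) = lam."""
    lam = theta[0]
    t = np.asarray(t, dtype=float)
    delta = np.asarray(delta, dtype=float)
    return (delta / lam - t).reshape(1, -1)

t_obs = np.array([0.5, 1.2, 0.3, 2.0, 0.9])
d_obs = np.array([1, 0, 1, 1, 0])

# Closed-form root: number of events divided by total observed time
lam_hat = d_obs.sum() / t_obs.sum()

ee = exponential_score([lam_hat], t_obs, d_obs)
row_sums = ee.sum(axis=1)  # approximately zero at the root
```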

Examples

Construction of the estimating equation(s) with ee_survival_model should be done similar to the following

>>> import numpy as np
>>> import pandas as pd
>>> from delicatessen import MEstimator
>>> from delicatessen.estimating_equations import ee_survival_model

Some generic survival data to estimate a parametric survival model with

>>> n = 100
>>> data = pd.DataFrame()
>>> data['C'] = np.random.weibull(a=1, size=n)
>>> data['C'] = np.where(data['C'] > 5, 5, data['C'])
>>> data['T'] = 0.8*np.random.weibull(a=0.8, size=n)
>>> data['delta'] = np.where(data['T'] < data['C'], 1, 0)
>>> data['t'] = np.where(data['delta'] == 1, data['T'], data['C'])

Defining psi, or the stacked estimating equations

>>> def psi(theta):
...     return ee_survival_model(theta=theta,
...                              t=data['t'], delta=data['delta'],
...                              distribution='weibull')

Calling the M-estimator

>>> estr = MEstimator(stacked_equations=psi, init=[1., 1.])
>>> estr.estimate(solver='lm')

Inspecting the parameter estimates, variance, and confidence intervals

>>> estr.theta
>>> estr.variance
>>> estr.confidence_intervals()

Inspecting the specific parameter estimates

>>> estr.theta[0]  # lambda
>>> estr.theta[1]  # gamma

To generate predictions from this model, please use delicatessen.utilities.survival_predictions. See the corresponding documentation for further details.
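
For intuition about what such a prediction involves, the Weibull hazard in the table integrates to the survival function \(S(t) = \exp(-\lambda t^{\gamma})\). The sketch below evaluates that formula directly for illustrative parameter values. It does not call or reproduce survival_predictions, whose interface is described in its own documentation, and weibull_survival is a hypothetical name.

```python
import numpy as np

def weibull_survival(t, lam, gam):
    """Survival function S(t) = exp(-lam * t**gam) (hypothetical helper)."""
    t = np.asarray(t, dtype=float)
    return np.exp(-lam * t ** gam)

# Illustrative parameter values, not estimates from any data set
times = np.linspace(0, 3, 7)
surv = weibull_survival(times, lam=1.2, gam=0.8)
```

In practice the values of lam and gam would come from the estimated theta, and uncertainty in the predictions would need the estimated variance as well.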

References

Collett D. (2015). Parametric proportional hazards models. In: Modelling Survival Data in Medical Research. CRC Press. pp. 178-192.