delicatessen.utilities.aft_predictions_individual
- aft_predictions_individual(X, times, theta, distribution, measure='survival')
Compute predicted survival analysis measures from an accelerated failure time (AFT) model for a given design matrix and set of times. This function is meant to be used with the parametrization of ee_aft to generate predicted survival (or other measures) at user-specified time points. Predictions are generated via
\[\begin{split}S(t) = S_{\epsilon}\left( \frac{\log(t) - X \beta^T}{\sigma} \right) \\ h(t) = (\sigma t)^{-1} h_{\epsilon}\left( \frac{\log(t) - X \beta^T}{\sigma} \right)\end{split}\]where the corresponding function for the given AFT distribution is
Distribution | Keyword | \(S_\epsilon(x)\) | \(h_\epsilon(x)\)
Exponential | exponential | \(\exp(-\exp(x))\) | \(\exp(x)\)
Weibull | weibull | \(\exp(-\exp(x))\) | \(\exp(x)\)
Log-Logistic | log-logistic | \((1 + \exp(x))^{-1}\) | \((1 + \exp(-x))^{-1}\)
Log-Normal | log-normal | \(1 - \Phi(x)\) | \(\frac{\exp(-x^2 / 2)}{[1 - \Phi(x)] \sqrt{2 \pi }}\)
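The formulas above can be sketched for a single individual and a single time point as follows. This is a minimal illustration using only the standard library, not the library's implementation: the names `aft_survival_hazard` and `BASELINE` are hypothetical, the coefficients and scale are passed separately as `beta` and `sigma` (an assumption made for readability), and the log-logistic entries use the standard logistic forms \((1 + \exp(x))^{-1}\) and \((1 + \exp(-x))^{-1}\). For the exponential model the scale is assumed to be fixed at 1.

```python
import math


def std_normal_cdf(x):
    """Standard normal CDF, Phi(x), computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def s_normal(x):
    """Survival function of the standard normal, 1 - Phi(x)."""
    return 1.0 - std_normal_cdf(x)


def h_normal(x):
    """Hazard of the standard normal: density divided by survival."""
    return math.exp(-x * x / 2.0) / (s_normal(x) * math.sqrt(2.0 * math.pi))


# Keyword -> (S_epsilon, h_epsilon), following the table of distributions.
BASELINE = {
    'exponential': (lambda x: math.exp(-math.exp(x)), lambda x: math.exp(x)),
    'weibull': (lambda x: math.exp(-math.exp(x)), lambda x: math.exp(x)),
    'log-logistic': (lambda x: 1.0 / (1.0 + math.exp(x)),
                     lambda x: 1.0 / (1.0 + math.exp(-x))),
    'log-normal': (s_normal, h_normal),
}


def aft_survival_hazard(x_row, beta, sigma, t, distribution):
    """Return (S(t), h(t)) for one covariate row and one time point."""
    s_eps, h_eps = BASELINE[distribution]
    linear = sum(x * b for x, b in zip(x_row, beta))   # X beta^T for this row
    z = (math.log(t) - linear) / sigma                 # standardized log-time
    return s_eps(z), h_eps(z) / (sigma * t)


# Hypothetical coefficient and scale values, chosen only for illustration.
s, h = aft_survival_hazard([1.0, 1.0], beta=[5.0, -0.5], sigma=0.8,
                           t=50.0, distribution='weibull')
print(s, h)
```

Keeping the baseline functions in a dictionary keyed by the distribution keyword mirrors the table: switching distributions only changes which pair of functions is evaluated at the standardized log-time.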
Note that one only needs to ensure that distribution is set to the same argument as the one used in ee_aft.
- Parameters
X (ndarray, list, vector) – 2-dimensional vector of n observed values for b variables.
times (float, int, ndarray, list, vector) – Either a single time point or a vector of time points to generate predicted measures at. This argument determines the shape of the output.
theta (ndarray, list, vector) – Estimated coefficients from MEstimator.theta with ee_aft.
distribution (str) – Distribution to use for the AFT model. See the table above for options.
measure (str, optional) – Measure to compute. Options include survival ('survival'), density ('density'), risk or the cumulative distribution function ('risk'), hazard ('hazard'), or cumulative hazard ('cumulative_hazard'). Default is 'survival'.
- Returns
Returns an n-by-t NumPy array of predictions, where n is the number of rows in the design matrix and t is the number of time points.
- Return type
array
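The five options for measure are linked by standard survival-analysis identities: risk is \(1 - S(t)\), density is \(h(t) S(t)\), and cumulative hazard is \(-\log S(t)\). The sketch below shows those identities for one time point. It is an illustration only; the helper name `measures_from_survival` is hypothetical, and this is not necessarily how the library computes each measure internally.

```python
import math


def measures_from_survival(survival, hazard):
    """Derive the remaining measures from S(t) and h(t) at one time point."""
    return {
        'survival': survival,
        'risk': 1.0 - survival,                     # F(t) = 1 - S(t)
        'hazard': hazard,
        'density': hazard * survival,               # f(t) = h(t) * S(t)
        'cumulative_hazard': -math.log(survival),   # H(t) = -log S(t)
    }


# Hypothetical survival and hazard values, chosen only for illustration.
for name, value in measures_from_survival(survival=0.8, hazard=0.01).items():
    print(name, value)
```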
Examples
The following illustrates how to use aft_predictions_individual to generate predicted survival probabilities at specific times for individuals.
>>> import numpy as np
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> from delicatessen import MEstimator
>>> from delicatessen.estimating_equations import ee_aft
>>> from delicatessen.utilities import aft_predictions_individual
>>> from delicatessen.data import load_breast_cancer
Loading breast cancer data from Collett 2015
>>> dat = load_breast_cancer()
>>> delta = dat[:, 0]
>>> t = dat[:, 1]
>>> covars = np.asarray([np.ones(dat.shape[0]), dat[:, 0]]).T
Estimating the parameters of a Weibull AFT model
>>> def psi(theta):
...     return ee_aft(theta=theta, t=t, delta=delta,
...                   X=covars, distribution='weibull')
>>> estr = MEstimator(psi, init=[5., 0., 0.])
>>> estr.estimate()
Now we can generate predicted values of survival for each observation. Suppose we want the survival at time 50 for all units. The following code returns the predicted survival for every unit:
>>> aft_predictions_individual(X=covars, times=50.,
...                            theta=estr.theta,
...                            distribution='weibull')
Alternatively, we can request the predicted survival at multiple points at once. The following code computes the predicted survival at times 50, 100, 150, 200, 250 for all units.
>>> aft_predictions_individual(X=covars, times=[50, 100, 150, 200, 250],
...                            theta=estr.theta,
...                            distribution='weibull')
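To make the n-by-t layout of a multi-time request concrete, the following standard-library sketch builds the same kind of grid by hand for a Weibull model: one row per unit and one column per time point. The function name `weibull_survival_grid` and the coefficient and scale values are hypothetical, and `beta` and `sigma` are passed separately as an assumption for readability; it illustrates the output shape rather than reproducing the library's code.

```python
import math


def weibull_survival_grid(X, beta, sigma, times):
    """Return a list of n rows, each holding survival at every time in `times`."""
    grid = []
    for row in X:
        linear = sum(x * b for x, b in zip(row, beta))   # X beta^T for this unit
        grid.append([
            math.exp(-math.exp((math.log(t) - linear) / sigma))
            for t in times
        ])
    return grid


# Two hypothetical units: an intercept column plus one binary covariate.
X = [[1.0, 0.0], [1.0, 1.0]]
times = [50, 100, 150, 200, 250]
grid = weibull_survival_grid(X, beta=[5.0, -0.5], sigma=0.8, times=times)

for row in grid:
    print([round(value, 3) for value in row])
```

Each row is non-increasing from left to right, since survival can only fall as time advances.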
Different survival measures can be requested through the optional measure argument.
References
Collett D. (2015). Accelerated failure time and other parametric models. In: Modelling survival data in medical research. CRC Press. p. 242.