delicatessen.utilities.aft_predictions_function
- aft_predictions_function(X, times, theta, covariance, distribution, measure='survival', alpha=0.05)
Compute estimated functions for survival analysis measures from an accelerated failure time (AFT) model across a specified time period. This function is meant to be used with ee_aft and is a simple way to compute values of a survival analysis metric (and the corresponding point-wise confidence intervals) at user-specified time points for a given pattern of covariates. This functionality is meant to help with generating plots or describing results.
To generate predicted values of the desired measure, the survival and hazard are computed using
\[\begin{split}S(t) = S_{\epsilon}\left( \frac{\log(t) - X \beta^T}{\sigma} \right) \\ h(t) = (\sigma t)^{-1} h_{\epsilon}\left( \frac{\log(t) - X \beta^T}{\sigma} \right)\end{split}\]
where the corresponding functions \(S_\epsilon\) and \(h_\epsilon\) for the given AFT distribution are
| Distribution | Keyword | \(S_\epsilon(x)\) | \(h_\epsilon(x)\) |
|---|---|---|---|
| Exponential | exponential | \(\exp(-\exp(x))\) | \(\exp(x)\) |
| Weibull | weibull | \(\exp(-\exp(x))\) | \(\exp(x)\) |
| Log-Logistic | log-logistic | \((1 + \exp(x))^{-1}\) | \((1 + \exp(-x))^{-1}\) |
| Log-Normal | log-normal | \(1 - \Phi(x)\) | \(\frac{\exp(-x^2 / 2)}{[1 - \Phi(x)] \sqrt{2 \pi }}\) |
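As a rough illustration of the formulas above, the following standard-library sketch evaluates \(S(t)\) and \(h(t)\) for a single covariate pattern and then derives the other measures from them using the usual survival-analysis identities (risk is \(1 - S\), density is \(h \times S\), cumulative hazard is \(-\log S\)). The names s_eps, h_eps, aft_survival_and_hazard, and derived_measures are hypothetical helpers written for this illustration and are not part of delicatessen; the parameter values in the usage lines are arbitrary placeholders, not estimates.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the log-normal case


def s_eps(x, distribution):
    """Survival function of the error term (hypothetical helper)."""
    if distribution in ("exponential", "weibull"):
        return math.exp(-math.exp(x))
    if distribution == "log-logistic":
        # Survival function of the standard logistic distribution
        return 1.0 / (1.0 + math.exp(x))
    if distribution == "log-normal":
        return 1.0 - _STD_NORMAL.cdf(x)
    raise ValueError(f"Unsupported distribution: {distribution}")


def h_eps(x, distribution):
    """Hazard function of the error term (hypothetical helper)."""
    if distribution in ("exponential", "weibull"):
        return math.exp(x)
    if distribution == "log-logistic":
        return 1.0 / (1.0 + math.exp(-x))
    if distribution == "log-normal":
        return _STD_NORMAL.pdf(x) / (1.0 - _STD_NORMAL.cdf(x))
    raise ValueError(f"Unsupported distribution: {distribution}")


def aft_survival_and_hazard(t, x_row, beta, sigma, distribution):
    """Evaluate S(t) and h(t) for one covariate pattern x_row."""
    linear_predictor = sum(x * b for x, b in zip(x_row, beta))
    z = (math.log(t) - linear_predictor) / sigma
    survival = s_eps(z, distribution)
    hazard = h_eps(z, distribution) / (sigma * t)
    return survival, hazard


def derived_measures(survival, hazard):
    """Standard identities linking the survival-analysis measures."""
    return {
        "survival": survival,
        "risk": 1.0 - survival,
        "density": hazard * survival,
        "hazard": hazard,
        "cumulative_hazard": -math.log(survival),
    }


# Placeholder parameter values chosen only to exercise the functions
s, h = aft_survival_and_hazard(t=50.0, x_row=[1.0, 1.0], beta=[5.0, -1.0],
                               sigma=1.0, distribution="weibull")
print(derived_measures(s, h))
```

The sketch takes \(\beta\) and \(\sigma\) as separate arguments so that it makes no claim about how ee_aft arranges them inside theta.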
Note that one only needs to ensure that distribution is set to the same argument as the one used in ee_aft.
From these values, the specified measure is computed (see convert_survival_measures for details). The variance for the chosen measure is then computed using the Delta Method with automatic differentiation via the delta_method function. Corresponding \(100(1-\alpha)\)% Wald-type point-wise confidence intervals are then computed using this variance.
- Parameters
X (ndarray, list, vector) – 2-dimensional array of n observed values for b variables.
times (float, int, ndarray, list, vector) – Either a single time point or a vector of time points to generate predicted measures at. This argument determines the shape of the output.
theta (ndarray, list, vector) – Estimated coefficients from MEstimator.theta with ee_aft.
covariance (ndarray, list, vector) – Estimated covariance matrix from MEstimator.variance with ee_aft.
distribution (str) – Distribution to use for the AFT model. See table for options.
measure (str, optional) – Measure to compute. Options include survival ('survival'), density ('density'), risk or the cumulative distribution function ('risk'), hazard ('hazard'), or cumulative hazard ('cumulative_hazard'). Default is 'survival'.
alpha (float, optional) – The \(\alpha\) level for the corresponding confidence intervals. Default is 0.05, which yields 95% confidence intervals. Note that \(0 < \alpha < 1\).
- Returns
Returns a t-by-4 NumPy array of predictions, where the first column is the survival metric, the second is the corresponding variance, and the last two columns are the lower confidence limit and upper confidence limit, respectively.
- Return type
array
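To make the relationship between the four returned columns concrete, here is a minimal sketch of a first-order Delta Method variance followed by a Wald-type interval, assembled into one row of the form (estimate, variance, lower limit, upper limit). It is an illustration under stated assumptions, not the library's implementation: a central finite-difference gradient stands in for the automatic differentiation that delta_method uses, the interval is computed on the scale of the measure itself, the parameter layout (intercept, slope, \(\log \sigma\)) is assumed for the example, and every function name and numeric value below is a hypothetical placeholder.

```python
import math
from statistics import NormalDist


def numerical_gradient(g, theta, step=1e-6):
    """Central finite-difference gradient of a scalar function g at theta."""
    grad = []
    for j in range(len(theta)):
        up, down = list(theta), list(theta)
        up[j] += step
        down[j] -= step
        grad.append((g(up) - g(down)) / (2.0 * step))
    return grad


def delta_method_variance(g, theta, covariance):
    """First-order Delta Method: Var[g(theta)] ~ grad' * covariance * grad."""
    grad = numerical_gradient(g, theta)
    k = len(theta)
    return sum(grad[i] * covariance[i][j] * grad[j]
               for i in range(k) for j in range(k))


def wald_interval(estimate, variance, alpha=0.05):
    """Wald-type interval: estimate +/- z_{1 - alpha/2} * sqrt(variance)."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * math.sqrt(variance)
    return estimate - half_width, estimate + half_width


def weibull_risk(theta, t=50.0, x_row=(1.0, 1.0)):
    """Risk at time t for a Weibull AFT model.

    ASSUMPTION: theta is laid out as (intercept, slope, log(sigma)).
    """
    *beta, log_sigma = theta
    sigma = math.exp(log_sigma)
    z = (math.log(t) - sum(x * b for x, b in zip(x_row, beta))) / sigma
    return 1.0 - math.exp(-math.exp(z))


# Arbitrary placeholder values, NOT estimates from any data set
theta_placeholder = [5.0, -1.0, 0.0]
cov_placeholder = [[0.04, -0.03, 0.00],
                   [-0.03, 0.05, 0.00],
                   [0.00, 0.00, 0.01]]

estimate = weibull_risk(theta_placeholder)
variance = delta_method_variance(weibull_risk, theta_placeholder, cov_placeholder)
lower, upper = wald_interval(estimate, variance)
row = [estimate, variance, lower, upper]  # same column order as the output
print(row)
```

An interval built this way is symmetric around the estimate, so for measures bounded between 0 and 1 it can extend past those bounds when the variance is large.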
Examples
The following illustrates how to use aft_predictions_function to generate a plot of the risk function. Other metrics can be plotted using a similar approach.
>>> import numpy as np
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> from delicatessen import MEstimator
>>> from delicatessen.estimating_equations import ee_aft
>>> from delicatessen.utilities import aft_predictions_function
>>> from delicatessen.data import load_breast_cancer
Loading breast cancer data from Collett 2015
>>> dat = load_breast_cancer()
>>> delta = dat[:, 0]
>>> t = dat[:, 1]
>>> covars = np.asarray([np.ones(dat.shape[0]), dat[:, 0]]).T
Estimating the parameters of a Weibull AFT model
>>> def psi(theta):
>>>     return ee_aft(theta=theta, t=t, delta=delta,
>>>                   X=covars, distribution='weibull')
>>> estr = MEstimator(psi, init=[5., 0., 0.])
>>> estr.estimate()
Now we can generate predicted values of the risk at specified times for a specific covariate pattern. Suppose we wanted the risk at a time of 50 for those with a negative stain and the corresponding confidence intervals. The following code gives us the risk, variance, and confidence intervals
>>> aft_predictions_function(times=50., theta=estr.theta, covariance=estr.variance,
>>>                          X=[[1, 0]],  # Intercept-only
>>>                          distribution='weibull', measure='risk')
We can do the same process for those with a positive stain by switching out the design matrix
>>> aft_predictions_function(times=50., theta=estr.theta, covariance=estr.variance,
>>>                          X=[[1, 1]],  # Intercept and positive stain
>>>                          distribution='weibull', measure='risk')
Now, we will use these predictions to plot the risk function over the time period. We generate a vector of times for the plot. These should be chosen ‘densely’, so the plot appears smooth
>>> # Time steps to generate predictions for
>>> times = np.linspace(0.01, 230, 100)
>>> # Generating predictions
>>> s0_hat = aft_predictions_function(times=times, theta=estr.theta, covariance=estr.variance,
>>>                                   X=[[1, 0]],  # Intercept only
>>>                                   distribution='weibull', measure='risk')
>>> s1_hat = aft_predictions_function(times=times, theta=estr.theta, covariance=estr.variance,
>>>                                   X=[[1, 1]],  # Intercept and positive stain
>>>                                   distribution='weibull', measure='risk')
>>> # Plot: shaded bands are the confidence limits (columns 2 and 3), lines are the point estimates (column 0)
>>> plt.fill_between(times, s1_hat[:, 2], s1_hat[:, 3], color='blue', alpha=0.3)
>>> plt.fill_between(times, s0_hat[:, 2], s0_hat[:, 3], color='red', alpha=0.3)
>>> plt.plot(times, s1_hat[:, 0], '-', color='blue', alpha=0.3, label='Pos')
>>> plt.plot(times, s0_hat[:, 0], '-', color='red', alpha=0.3, label='Neg')
>>> plt.xlabel("Time")
>>> plt.ylabel("Risk")
>>> plt.legend()
>>> plt.show()
Here, the fill_between displays the point-wise 95% confidence intervals. Other survival measures can be requested through the optional measure argument.
References
Collett D. (2015). Accelerated failure time and other parametric models. In: Modelling Survival Data in Medical Research. CRC Press. p. 242.