delicatessen.utilities.aft_predictions_function

aft_predictions_function(X, times, theta, covariance, distribution, measure='survival', alpha=0.05)

Compute estimated functions for survival analysis measures from an accelerated failure time (AFT) model across a specified time period. This function is meant to be used with ee_aft and is a simple way to compute values of a survival analysis metric (and the corresponding point-wise confidence intervals) at user-specified time points for a given pattern of covariates. This functionality is meant to help with generating plots or describing results.

To generate predicted values of the desired measure, the survival and hazard are computed using

\[\begin{split}S(t) = S_{\epsilon}\left( \frac{\log(t) - X \beta^T}{\sigma} \right) \\ h(t) = (\sigma t)^{-1} h_{\epsilon}\left( \frac{\log(t) - X \beta^T}{\sigma} \right)\end{split}\]

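where the corresponding functions for each AFT distribution are

==============  ==============  ==========================  ==========================================================
Distribution    Keyword         \(S_\epsilon(x)\)           \(h_\epsilon(x)\)
==============  ==============  ==========================  ==========================================================
Exponential     exponential     \(\exp(-\exp(x))\)          \(\exp(x)\)
Weibull         weibull         \(\exp(-\exp(x))\)          \(\exp(x)\)
Log-Logistic    log-logistic    \((1 + \exp(x))^{-1}\)      \((1 + \exp(-x))^{-1}\)
Log-Normal      log-normal      \(1 - \Phi(x)\)             \(\frac{\exp(-x^2 / 2)}{[1 - \Phi(x)] \sqrt{2 \pi }}\)
==============  ==============  ==========================  ==========================================================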

Note that distribution must be set to the same value that was used in ee_aft.
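The survival and hazard formulas above can be sketched in plain Python for a single covariate pattern. This is an illustration only, not the delicatessen implementation: the helper names (baseline_survival, baseline_hazard, aft_survival_hazard) are hypothetical, sigma is passed directly rather than read from theta, and the exponential model is assumed to correspond to sigma = 1. The log-logistic case uses the standard logistic survival function \((1 + \exp(x))^{-1}\).

```python
import math


def std_normal_cdf(x):
    """Standard normal CDF, Phi(x), computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def baseline_survival(x, distribution):
    """S_epsilon(x) for the error distribution of the AFT model."""
    if distribution in ("exponential", "weibull"):
        return math.exp(-math.exp(x))
    if distribution == "log-logistic":
        return 1.0 / (1.0 + math.exp(x))
    if distribution == "log-normal":
        return 1.0 - std_normal_cdf(x)
    raise ValueError(f"Unknown distribution: {distribution}")


def baseline_hazard(x, distribution):
    """h_epsilon(x) for the error distribution of the AFT model."""
    if distribution in ("exponential", "weibull"):
        return math.exp(x)
    if distribution == "log-logistic":
        return 1.0 / (1.0 + math.exp(-x))
    if distribution == "log-normal":
        pdf = math.exp(-x ** 2 / 2.0) / math.sqrt(2.0 * math.pi)
        return pdf / (1.0 - std_normal_cdf(x))
    raise ValueError(f"Unknown distribution: {distribution}")


def aft_survival_hazard(t, x_row, beta, sigma, distribution):
    """Return (S(t), h(t)) for one covariate pattern x_row.

    Assumption for this sketch: sigma is supplied on its natural scale,
    and sigma = 1.0 should be passed for the exponential model.
    """
    linear_predictor = sum(x * b for x, b in zip(x_row, beta))
    z = (math.log(t) - linear_predictor) / sigma
    survival = baseline_survival(z, distribution)
    hazard = baseline_hazard(z, distribution) / (sigma * t)
    return survival, hazard


# With t = 1, beta = 0, and sigma = 1 the standardized residual is 0
s_weibull, h_weibull = aft_survival_hazard(1.0, [1.0], [0.0], 1.0, "weibull")
s_loglogistic, h_loglogistic = aft_survival_hazard(1.0, [1.0], [0.0], 1.0, "log-logistic")
```

Evaluating at a standardized residual of zero is a convenient sanity check: the log-logistic and log-normal survival are both 0.5 there, since both error distributions are symmetric about zero.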

From these values, the specified measure is computed (see convert_survival_measures for details). The variance for the chosen measure is then computed using the Delta Method with automatic differentiation via the delta_method function. The corresponding \(100(1-\alpha)\)% Wald-type point-wise confidence intervals are then computed using this variance.
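The two steps described above, converting survival and hazard into the requested measure and then applying the Delta Method, can be sketched as follows. This is a rough illustration under stated assumptions, not the library's code: delicatessen uses automatic differentiation, whereas this sketch substitutes a central finite-difference gradient, and all function names and the numeric values of theta and the covariance are made up for the example. The conversions rely on the standard identities \(F(t) = 1 - S(t)\), \(f(t) = h(t) S(t)\), and \(H(t) = -\log S(t)\).

```python
import math
from statistics import NormalDist


def measure_from_survival_hazard(survival, hazard, measure="survival"):
    """Convert S(t) and h(t) into the requested measure (hypothetical helper)."""
    if measure == "survival":
        return survival
    if measure == "risk":
        return 1.0 - survival           # F(t) = 1 - S(t)
    if measure == "hazard":
        return hazard
    if measure == "density":
        return hazard * survival        # f(t) = h(t) S(t)
    if measure == "cumulative_hazard":
        return -math.log(survival)      # H(t) = -log S(t)
    raise ValueError(f"Unknown measure: {measure}")


def numerical_gradient(g, theta, step=1e-6):
    """Central finite-difference gradient of a scalar function g at theta."""
    grad = []
    for j in range(len(theta)):
        up, down = list(theta), list(theta)
        up[j] += step
        down[j] -= step
        grad.append((g(up) - g(down)) / (2.0 * step))
    return grad


def delta_method_wald(g, theta, covariance, alpha=0.05):
    """Return (estimate, variance, lower, upper) for the scalar g(theta)."""
    estimate = g(theta)
    grad = numerical_gradient(g, theta)
    k = len(theta)
    # Delta Method variance: grad^T * covariance * grad
    variance = sum(grad[i] * covariance[i][j] * grad[j]
                   for i in range(k) for j in range(k))
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * math.sqrt(variance)
    return estimate, variance, estimate - half_width, estimate + half_width


def weibull_risk_at_50(theta):
    """Risk at t = 50 for a Weibull AFT with intercept mu and scale sigma."""
    mu, sigma = theta
    z = (math.log(50.0) - mu) / sigma
    survival = math.exp(-math.exp(z))
    hazard = math.exp(z) / (sigma * 50.0)
    return measure_from_survival_hazard(survival, hazard, measure="risk")


# Made-up parameter values and covariance, for illustration only
illustrative_theta = [5.0, 1.0]
illustrative_cov = [[0.04, 0.0], [0.0, 0.01]]
est, var, lcl, ucl = delta_method_wald(weibull_risk_at_50,
                                       illustrative_theta, illustrative_cov)
```

The returned tuple follows the same order as the columns of the output array described under Returns: the measure, its variance, and the lower and upper confidence limits.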

Parameters
  • X (ndarray, list, vector) – 2-dimensional array of n observed values for b variables.

  • times (float, int, ndarray, list, vector) – Either a single time point or a vector of time points to generate predicted measures at. This argument determines the shape of the output.

  • theta (ndarray, list, vector) – Estimated coefficients from MEstimator.theta with ee_aft.

  • covariance (ndarray, list, vector) – Estimated covariance matrix from MEstimator.variance with ee_aft.

  • distribution (str) – Distribution to use for the AFT model. See table for options.

  • measure (str, optional) – Measure to compute. Options include survival ('survival'), density ('density'), risk or the cumulative distribution function ('risk'), hazard ('hazard'), or cumulative hazard ('cumulative_hazard'). Default is 'survival'.

  • alpha (float, optional) – The \(\alpha\) level for the corresponding confidence intervals. Default is 0.05, which corresponds to 95% confidence intervals. Note that \(0 < \alpha < 1\).

Returns

Returns a t-by-4 NumPy array of predictions, where the first column is the survival metric, the second is the corresponding variance, and the last two columns are the lower confidence limit and upper confidence limit, respectively.

Return type

array

Examples

The following illustrates how to use aft_predictions_function to generate a plot of the risk function. Other metrics can be plotted using a similar approach.

>>> import numpy as np
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> from delicatessen import MEstimator
>>> from delicatessen.estimating_equations import ee_aft
>>> from delicatessen.utilities import aft_predictions_function
>>> from delicatessen.data import load_breast_cancer

Loading breast cancer data from Collett 2015

>>> dat = load_breast_cancer()
>>> delta = dat[:, 0]
>>> t = dat[:, 1]
>>> # Design matrix: intercept and stain indicator (assumed to be the third column)
>>> covars = np.asarray([np.ones(dat.shape[0]), dat[:, 2]]).T

Estimating the parameters of a Weibull AFT model

>>> def psi(theta):
>>>     return ee_aft(theta=theta, t=t, delta=delta,
>>>                   X=covars, distribution='weibull')
>>> estr = MEstimator(psi, init=[5., 0., 0.])
>>> estr.estimate()

Now we can generate predicted values of the risk at specified times for a specific covariate pattern. Suppose we want the risk at time 50, with the corresponding confidence interval, for those with a negative stain. The following code gives us the risk, variance, and confidence limits.

>>> aft_predictions_function(times=50., theta=estr.theta, covariance=estr.variance,
>>>                          X=[[1, 0]],  # Intercept-only
>>>                          distribution='weibull', measure='risk')

We can repeat the process for those with a positive stain by switching out the design matrix.

>>> aft_predictions_function(times=50., theta=estr.theta, covariance=estr.variance,
>>>                          X=[[1, 1]],  # Intercept and positive stain
>>>                          distribution='weibull', measure='risk')

Now, we will use these predictions to plot the risk function over the time period. We generate a vector of times for the plot. These should be chosen ‘densely’, so the plot appears smooth.

>>> # Time steps to generate predictions for
>>> times = np.linspace(0.01, 230, 100)
>>> # Generating predictions
>>> s0_hat = aft_predictions_function(times=times, theta=estr.theta, covariance=estr.variance,
>>>                                   X=[[1, 0]],  # Intercept only
>>>                                   distribution='weibull', measure='risk')
>>> s1_hat = aft_predictions_function(times=times, theta=estr.theta, covariance=estr.variance,
>>>                                   X=[[1, 1]],  # Intercept and positive stain
>>>                                   distribution='weibull', measure='risk')
>>> # Plot
>>> plt.fill_between(times, s1_hat[:, 2], s1_hat[:, 3], color='blue', alpha=0.3)
>>> plt.fill_between(times, s0_hat[:, 2], s0_hat[:, 3], color='red', alpha=0.3)
>>> plt.plot(times, s1_hat[:, 0], '-', color='blue', alpha=0.3, label='Pos')
>>> plt.plot(times, s0_hat[:, 0], '-', color='red', alpha=0.3, label='Neg')
>>> plt.xlabel("Time")
>>> plt.ylabel("Risk")
>>> plt.legend()
>>> plt.show()

Here, the fill_between displays the point-wise 95% confidence intervals. Other survival measures can be requested through the optional measure argument.

References

Collett D. (2015). Accelerated failure time and other parametric models. In: Modelling survival data in medical research. CRC Press. p. 242.