delicatessen.utilities.spline
- spline(variable, knots, power=3, restricted=True, normalized=False)
Generate generic polynomial spline terms for a given NumPy array and pre-specified knots. The default is restricted cubic splines, but unrestricted splines and other polynomial powers can also be specified.
Unrestricted splines for knot \(k\) are generated using the following formula
\[s_k(X) = I(X > k) \left\{ X - k \right\}^a\]where \(a\) is the power (\(a=3\) for cubic splines).
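The formula above can be sketched in a few lines of NumPy. This is only an illustration of the equation, not the library's implementation, and the helper name `unrestricted_term` is hypothetical.

```python
import numpy as np

def unrestricted_term(x, knot, power=3):
    """Hypothetical helper: s_k(X) = I(X > k) * (X - k)**power."""
    x = np.asarray(x, dtype=float)
    # Zero at or below the knot, (x - knot)**power above it
    return np.where(x > knot, (x - knot) ** power, 0.0)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
s0 = unrestricted_term(x, knot=0.0, power=3)
```

Observations at or below the knot contribute zero, so each term only changes the fitted curve to the right of its knot.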
Restricted splines are generated via
\[r_k(X) = I(X > k) \left\{ X - k \right\}^a - s_K(X)\]where \(K\) is the largest knot value.
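A minimal sketch of the restricted formula, read literally as each unrestricted term minus the term for the largest knot. The helper name `restricted_terms` is hypothetical, and the library itself may handle details (such as missing values) differently.

```python
import numpy as np

def restricted_terms(x, knots, power=3):
    """Hypothetical helper: r_k(X) = s_k(X) - s_K(X), K the largest knot."""
    x = np.asarray(x, dtype=float)
    knots = sorted(knots)
    # One unrestricted column per knot, in ascending knot order
    s = np.column_stack([np.where(x > k, (x - k) ** power, 0.0) for k in knots])
    # Subtract the largest-knot column from the others, then drop it
    return s[:, :-1] - s[:, [-1]]

x = np.array([-2.0, -0.5, 0.5, 2.0])
r = restricted_terms(x, knots=[-1, 0, 1], power=2)
```

Because the term for the largest knot is subtracted from every other term and then dropped, three knots yield two columns.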
When normalization is requested, the spline terms are divided by the difference between the upper and lower knots, raised to the corresponding power. Normalizing the splines can be helpful for the root-finding procedure.
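To see why scaling can matter, the sketch below compares a raw cubic term with a scaled one. It assumes the divisor is the knot range raised to the spline power; the exact exponent the library applies is not stated here, so treat the divisor as an assumption.

```python
import numpy as np

x = np.array([10.0, 40.0, 90.0])
knots = [20.0, 50.0, 80.0]
power = 3

# Unnormalized cubic term for the lowest knot: values grow quickly with x
raw = np.where(x > knots[0], (x - knots[0]) ** power, 0.0)

# ASSUMPTION: divide by (upper knot - lower knot)**power
divisor = (max(knots) - min(knots)) ** power
scaled = raw / divisor
```

For a variable measured on a scale like this one, the raw term reaches the hundreds of thousands while the scaled term stays near 1, which keeps the spline columns on a magnitude closer to the other covariates.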
- Parameters
variable (ndarray, vector, list) – 1-dimensional vector of observed values of the variable to generate spline terms for
knots (ndarray, vector, list) – 1-dimensional vector of pre-specified knot locations. All knots should be between the minimum and maximum values of the input variable
power (int, float, optional) – Power or polynomial term to use for the splines. Default is 3, which corresponds to cubic splines
restricted (bool, optional) – Whether to generate restricted or unrestricted splines. Default is True, which corresponds to restricted splines. Restricted splines return one fewer column than the number of knots, whereas unrestricted splines return the same number of columns as knots
normalized (bool, optional) – Whether to normalize, or divide, the spline terms by the difference between the upper and lower knots. Default is False, which corresponds to unnormalized splines
- Returns
A 2-dimensional array of the spline terms in ascending order of the knots.
- Return type
ndarray
Examples
Spline variables can be constructed similar to the following
>>> import numpy as np
>>> import pandas as pd
>>> from delicatessen.utilities import spline
Some generic data to estimate a generalized additive model
>>> x = np.random.normal(size=200)
A restricted quadratic spline with 3 knots (at -1, 0, 1) can be generated using the following function call
>>> spline(variable=x, knots=[-1, 0, 1], power=2, restricted=True)
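This function will return a 200 by 2 array here: one row per observation and, because the splines are restricted, one fewer column than the number of knots. Other knot specifications, other powers, and unrestricted splines can also be generated by updating the corresponding arguments.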
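The spline columns are usually appended to the other covariates before estimation. The sketch below shows one way to do that; to stay self-contained it builds a stand-in for the returned array directly from the restricted-spline formula rather than calling the library, and the names `terms` and `X` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200)
knots = [-1, 0, 1]

# Stand-in for the array returned by spline(..., power=2, restricted=True),
# built directly from the restricted-spline formula
s = np.column_stack([np.where(x > k, (x - k) ** 2, 0.0) for k in knots])
terms = s[:, :-1] - s[:, [-1]]

# Design matrix: intercept, linear term, and the two spline columns
X = np.column_stack([np.ones_like(x), x, terms])
```

Each spline column then receives its own coefficient in the model, alongside the intercept and the linear term for `x`.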
References
Mulla ZD (2007). Spline regression in clinical research. West Indian Med J, 56(1), 77.