delicatessen.estimating_equations.regression.ee_beta_regression
- ee_beta_regression(theta, X, y, weights=None, offset=None)
Estimating equation for a beta regression model. This estimating equation supports outcome data bounded within \((0,1)\). Here, the mean–precision parameterization of beta regression is used, with the parameters for the beta distribution defined as
\[\alpha = \mu \phi, \qquad \beta = (1 - \mu) \phi, \qquad \phi > 0\]
where \(\mu = g^{-1}(X_i \eta^T)\) is the regression model and \(g^{-1}\) is the inverse link function. The corresponding estimating equations for beta regression are
\[\begin{split}\sum_{i=1}^n \begin{bmatrix} \mu (1-\mu) \phi \left\{ \text{logit}(Y_i) - \dot{\gamma}(\mu \phi) + \dot{\gamma}((1-\mu)\phi)\right\} X_i^T \\ \dot{\gamma}(\phi) - \mu \dot{\gamma}(\mu\phi) - (1-\mu)\dot{\gamma}((1-\mu)\phi) + \mu \log(Y_i) + (1-\mu)\log(1-Y_i) \end{bmatrix} = 0\end{split}\]
where \(\dot{\gamma}\) denotes the digamma function. Here, \(\theta\) is a 1-by-(b \(+\) 1) array, where b is the number of distinct covariates included as part of X. For example, if X is an n-by-3 matrix, then \(\theta\) will be a 1-by-4 array.
- Parameters
theta (array) – Theta in this case consists of b+1 values. Therefore, the initial values should contain one more value than the number of columns in X. This can easily be implemented by [0., ] * X.shape[1] + [0., ].
X (ndarray, list, vector) – 2-dimensional vector of n observed values for b variables.
y (ndarray, list, vector) – 1-dimensional vector of n observed values.
weights (ndarray, list, vector, None, optional) – 1-dimensional vector of n weights. Default is None, which assigns a weight of 1 to all observations.
offset (ndarray, list, vector, None, optional) – A 1-dimensional offset to be included in the model. Default is None, which applies no offset term.
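The shape requirements in the parameter list can be checked before handing anything to the estimator. The following is a minimal sketch with made-up data; the arrays and their values are hypothetical, and the offset is assumed here to have one entry per observation.

```python
import numpy as np

# Hypothetical design matrix: n=5 observations, b=2 columns (intercept, W)
X = np.array([[1, 0], [1, 0], [1, 1], [1, 1], [1, 0]], dtype=float)
y = np.array([0.2, 0.35, 0.8, 0.9, 0.1])

# One starting value per column of X, plus one for the precision parameter
init = [0., ] * X.shape[1] + [0., ]

# Optional inputs: 1-dimensional, one entry per observation (assumed for offset)
weights = np.ones(X.shape[0])
offset = np.zeros(X.shape[0])

# Beta regression needs outcomes strictly inside (0, 1)
assert np.all((y > 0) & (y < 1))
print(len(init))  # 3
```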
- Returns
Returns a (b+1)-by-n NumPy array evaluated for the input theta.
- Return type
array
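To make the shape of the returned array concrete, the sketch below derives the per-observation score contributions directly from the beta log-likelihood under the mean-precision parameterization and stacks them into a (b+1)-by-n array. It is an illustration only, not the library's implementation: `beta_loglik` and `beta_score` are hypothetical helper names, a logit link is assumed, and the precision is kept on its natural scale rather than the log scale.

```python
import numpy as np
from scipy.special import digamma, expit, gammaln, logit


def beta_loglik(coefs, phi, X, y):
    """Per-observation beta log-likelihood (mean-precision form, logit link)."""
    mu = expit(X @ coefs)
    a, b = mu * phi, (1 - mu) * phi
    return (gammaln(phi) - gammaln(a) - gammaln(b)
            + (a - 1) * np.log(y) + (b - 1) * np.log(1 - y))


def beta_score(coefs, phi, X, y):
    """Hand-derived score contributions, stacked as a (b+1)-by-n array."""
    mu = expit(X @ coefs)
    a, b = mu * phi, (1 - mu) * phi
    # Derivative with respect to the regression coefficients (b-by-n)
    resid = logit(y) - digamma(a) + digamma(b)
    score_mean = (mu * (1 - mu) * phi * resid)[None, :] * X.T
    # Derivative with respect to the precision phi (length n)
    score_phi = (digamma(phi) - mu * digamma(a) - (1 - mu) * digamma(b)
                 + mu * np.log(y) + (1 - mu) * np.log(1 - y))
    return np.vstack([score_mean, score_phi[None, :]])


# Made-up inputs: n=4 observations, b=2 columns
X = np.array([[1., 0.], [1., 0.], [1., 1.], [1., 1.]])
y = np.array([0.2, 0.4, 0.8, 0.9])
coefs = np.array([-0.5, 1.5])
phi = 4.0

scores = beta_score(coefs, phi, X, y)
print(scores.shape)  # (3, 4)
```

Each column holds one observation's contribution; summing across columns gives the quantity that is set to zero when solving for the parameters.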
Examples
Construction of the estimating equation(s) with ee_beta_regression should be done similar to the following
>>> import numpy as np
>>> import pandas as pd
>>> from delicatessen import MEstimator
>>> from delicatessen.estimating_equations import ee_beta_regression
Some generic data to estimate a beta regression model with
>>> d = pd.DataFrame()
>>> d['W'] = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
>>> d['Y'] = [0.1, 0.2, 0.7, 0.11, 0.3, 0.4, 0.65, 0.01, 0.14, 0.9, 0.8, 0.56, 0.99, 0.82]
>>> d['C'] = 1
>>> y = d['Y']
>>> X = d[['C', 'W']]
Defining psi, or the stacked estimating equations
>>> def psi(theta):
...     return ee_beta_regression(theta, X=X, y=y)
Calling the M-estimator (note that init requires 3 values, since X.shape[1] is 2).
>>> estr = MEstimator(stacked_equations=psi, init=[0., 0., 0.])
>>> estr.estimate()
Inspecting the parameter estimates, variance, and confidence intervals
>>> estr.theta
>>> estr.variance
>>> estr.confidence_intervals()
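Because the last element of theta is the precision on the natural log scale, it needs to be back-transformed before it can be read as \(\phi\). The sketch below uses a made-up parameter vector (not estimates from the data above) with two regression coefficients, and assumes a logit link when converting the linear predictor to a mean.

```python
import numpy as np

# Hypothetical parameter vector, NOT estimates from the data above:
# two regression coefficients followed by log(phi)
theta = np.array([-1.0, 2.0, np.log(5.0)])

coefs, log_phi = theta[:-1], theta[-1]
phi = np.exp(log_phi)  # precision on its natural scale

# Predicted means for W=0 and W=1, assuming a logit link
X_new = np.array([[1.0, 0.0], [1.0, 1.0]])
mu = 1.0 / (1.0 + np.exp(-(X_new @ coefs)))

# Implied beta shape parameters under the mean-precision parameterization
alpha = mu * phi
beta = (1.0 - mu) * phi

# The mean of a Beta(alpha, beta) distribution is alpha / (alpha + beta)
print(np.allclose(alpha / (alpha + beta), mu))  # True
```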
Here, the first two values of theta correspond to the regression and the last value of theta corresponds to the precision parameter (on the natural log scale).
Weighted beta regression can be implemented by specifying the weights argument. An offset can be added by specifying the offset argument.
References
Ferrari S, & Cribari-Neto F. (2004). Beta regression for modelling rates and proportions. Journal of Applied Statistics, 31(7), 799-815.