delicatessen.sandwich.compute_sandwich
- compute_sandwich(stacked_equations, theta, deriv_method='approx', dx=1e-09, allow_pinv=True)
Compute the empirical sandwich variance estimator given a set of estimating equations and parameter estimates. Note that this function does not solve for the parameter estimates (unlike MEstimator); it only computes the sandwich at the provided values.
The empirical sandwich variance estimator is defined as
\[V_n(O_i; \theta) = B_n(O_i; \theta)^{-1} F_n(O_i; \theta) \left[ B_n(O_i; \theta)^{-1} \right]^{T}\]
where \(\psi(O_i; \theta)\) is the estimating function,
\[B_n(O_i; \theta) = n^{-1} \sum_{i=1}^n - \frac{\partial}{\partial \theta} \psi(O_i; \theta),\]
and
\[F_n(O_i; \theta) = n^{-1} \sum_{i=1}^n \psi(O_i; \theta) \psi(O_i; \theta)^T .\]
To compute the bread matrix, \(B_n\), the matrix of partial derivatives is computed using either finite-difference methods or automatic differentiation. For finite differences, the default is SciPy's approx_fprime functionality, which uses forward finite differences. Alternatively, delicatessen provides its own implementations of forward, backward, and central differences, and forward-mode automatic differentiation is also supported by a built-in implementation. To compute the meat matrix, \(F_n\), only linear algebra methods, implemented through NumPy, are necessary. The sandwich is then constructed from these two pieces using NumPy linear algebra.
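As a minimal sketch of how these pieces fit together (this is an illustration in plain NumPy, not delicatessen's internal implementation), the bread can be approximated with forward differences and the meat with a matrix product of the estimating functions, here for a single-parameter mean:

```python
import numpy as np

# Hypothetical illustration: sandwich pieces for estimating a mean
y = np.array([1, 2, 4, 1, 2, 3, 1, 5, 2], dtype=float)

def psi(theta):
    # v-by-n array of estimating functions; here v = 1 (the mean)
    return np.asarray([y - theta[0]])

theta = np.array([np.mean(y)])
n = len(y)
dx = 1e-9

# Bread: B_n = n^{-1} sum_i -psi'(O_i; theta), via forward differences
base = psi(theta).sum(axis=1)
bread = np.empty((theta.size, theta.size))
for j in range(theta.size):
    shifted = theta.copy()
    shifted[j] += dx
    bread[:, j] = -(psi(shifted).sum(axis=1) - base) / (dx * n)

# Meat: F_n = n^{-1} sum_i psi psi^T, computed as a matrix product
ev = psi(theta)                       # v-by-n
meat = (ev @ ev.T) / n

# Sandwich: B^{-1} F [B^{-1}]^T
bi = np.linalg.pinv(bread)
sandwich_asymp = bi @ meat @ bi.T
```

For the mean, the bread reduces to 1 and the sandwich reduces to the (biased) sample variance, which matches the worked example below.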
- Parameters
  - stacked_equations (function, callable) – Function that returns a v-by-n NumPy array of the estimating equations. See the provided examples in the documentation for how to construct a set of estimating equations.
  - theta (list, set, array) – Parameter estimates at which to compute the empirical sandwich variance estimator. Note that this function assumes that you have already solved for the theta that corresponds to the root of the input estimating equations.
  - deriv_method (str, optional) – Method to compute the derivative of the estimating equations for the bread matrix. Options include numerical approximation via the forward difference method via SciPy ('approx'), forward difference implemented by-hand ('fapprox'), backward difference implemented by-hand ('bapprox'), central difference implemented by-hand ('capprox'), or forward-mode automatic differentiation ('exact'). Default is 'approx'.
  - dx (float, optional) – Spacing used to numerically approximate the partial derivatives of the bread matrix. A small value for dx should be used, since large values can result in poor approximations. This argument is only used with the numerical approximation methods. Default is 1e-9.
  - allow_pinv (bool, optional) – Whether to allow the pseudo-inverse (via numpy.linalg.pinv) if the bread matrix is determined to be non-invertible. To disallow the pseudo-inverse (i.e., use numpy.linalg.inv), set this argument to False. Default is True, which is more robust across possible bread matrices.
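The deriv_method and dx choices matter because finite-difference error behaves differently across schemes. A generic sketch (using an arbitrary function, not delicatessen's internals) shows that central differences ('capprox'-style) converge faster than forward differences ('fapprox'-style) until floating-point rounding dominates at very small spacings:

```python
import numpy as np

# Compare forward vs. central difference error for d/dx exp(x) at x = 1.
# exp is chosen only because its derivative is known exactly (e).
f = np.exp
x, true = 1.0, np.exp(1.0)

for dx in (1e-3, 1e-6, 1e-9):
    fwd = (f(x + dx) - f(x)) / dx              # forward: error O(dx)
    ctr = (f(x + dx) - f(x - dx)) / (2 * dx)   # central: error O(dx^2)
    print(f"dx={dx:.0e}  fwd err={abs(fwd - true):.2e}  "
          f"ctr err={abs(ctr - true):.2e}")
```

This is why the docstring warns that large dx values can yield poor approximations: the truncation error of the default forward scheme shrinks only linearly in dx.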
- Returns
  Returns a p-by-p NumPy array for the input theta, where p = len(theta).
- Return type
  array
Examples
Loading necessary functions and building a generic data set for estimation of the mean
>>> import numpy as np
>>> from delicatessen import MEstimator
>>> from delicatessen import compute_sandwich
>>> from delicatessen.estimating_equations import ee_mean_variance
>>> y_dat = [1, 2, 4, 1, 2, 3, 1, 5, 2]
The following is an illustration of how to compute the sandwich covariance using only an estimating equation and the parameter values. The mean and variance (corresponding to ee_mean_variance) can be computed using NumPy by
>>> mean = np.mean(y_dat)
>>> var = np.var(y_dat, ddof=0)
For the corresponding estimating equation, we can use the built-in functionality as done below
>>> def psi(theta):
>>>     return ee_mean_variance(theta=theta, y=y_dat)
Calling the sandwich computation procedure
>>> sandwich_asymp = compute_sandwich(stacked_equations=psi, theta=[mean, var])
The output sandwich is the asymptotic variance (i.e., the variance that corresponds to the standard deviation). To get the variance that corresponds to the standard error, we rescale sandwich_asymp by the number of observations
>>> sandwich = sandwich_asymp / len(y_dat)
The standard errors are then
>>> se = np.sqrt(np.diag(sandwich))
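From the standard errors, Wald-type 95% confidence intervals follow directly. The sketch below is a hypothetical continuation of the example; the diagonal stand-in for the sandwich uses the known asymptotic variances for the mean and variance estimators (Var(Y) and the fourth central moment minus Var(Y) squared, respectively) rather than an actual compute_sandwich call, and it ignores the off-diagonal covariance:

```python
import numpy as np

y_dat = np.array([1, 2, 4, 1, 2, 3, 1, 5, 2], dtype=float)
theta = np.array([np.mean(y_dat), np.var(y_dat, ddof=0)])

# Stand-in for the rescaled compute_sandwich output (diagonal only):
# asymptotic variance of the mean is Var(Y); of the variance estimator,
# E[(Y - mu)^4] - Var(Y)^2
mu = np.mean(y_dat)
asymp = np.diag([np.var(y_dat, ddof=0),
                 np.mean((y_dat - mu) ** 4) - np.var(y_dat, ddof=0) ** 2])
sandwich = asymp / len(y_dat)

se = np.sqrt(np.diag(sandwich))
z = 1.96                                  # N(0, 1) quantile for 95%
ci = np.stack([theta - z * se, theta + z * se], axis=1)
```

Each row of ci gives the lower and upper Wald limits for the corresponding element of theta.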
References
Boos DD, & Stefanski LA. (2013). M-estimation (estimating equations). In Essential Statistical Inference (pp. 297-337). Springer, New York, NY.
Ross RK, Zivich PN, Stringer JSA, & Cole SR. (2024). M-estimation for common epidemiological measures: introduction and applied examples. International Journal of Epidemiology, 53(2), dyae030.
Stefanski LA, & Boos DD. (2002). The calculus of M-estimation. The American Statistician, 56(1), 29-38.