{ "cells": [ { "cell_type": "markdown", "id": "7de50c21-8db9-4ee9-b10a-23c59b006997", "metadata": {}, "source": [ "# Over-Identified Parameters\n", "\n", "Both `MEstimator` and `GMMEstimator` can be used interchangeably for many problems. The key difference between them is how the point estimates are computed. `MEstimator` uses a root-finding algorithm to find the approximate zeroes of the estimating equations, whereas `GMMEstimator` forms a quadratic form (a matrix product) of the estimating equations and searches for its minimum. Broadly, these are simply two different ways to compute the point estimates, and which one to prefer in a particular scenario depends on the problem. Because of this interchangeability, `MEstimator` could be replaced by `GMMEstimator` in any of the applied examples.\n", "\n", "However, GMM and `GMMEstimator` can also address problems where there are more estimating equations than parameters. Such parameters are often referred to as *over-identified*. This is in contrast to *just-identified* problems, where the number of estimating equations equals the number of parameters. With more estimating equations than parameters, there is generally no parameter value that sets every estimating equation to zero at once, so root-finding does not apply. Only minimization (and thus `GMMEstimator`) can be used in this scenario.
\n", "\n", "Here, we illustrate the use of `GMMEstimator` with over-identified parameters.\n", "\n", "## Setup" ] }, { "cell_type": "code", "execution_count": 1, "id": "f3c4fa0b-5074-4651-854b-5b3c018056d0", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Versions\n", "NumPy: 2.3.5\n", "SciPy: 1.16.3\n", "pandas: 2.3.3\n", "Delicatessen: 4.1\n" ] } ], "source": [ "import numpy as np\n", "import scipy as sp\n", "import pandas as pd\n", "\n", "import delicatessen as deli\n", "from delicatessen import MEstimator, GMMEstimator\n", "from delicatessen.estimating_equations import ee_regression\n", "from delicatessen.utilities import inverse_logit\n", "\n", "print(\"Versions\")\n", "print(\"NumPy: \", np.__version__)\n", "print(\"SciPy: \", sp.__version__)\n", "print(\"pandas: \", pd.__version__)\n", "print(\"Delicatessen: \", deli.__version__)" ] }, { "cell_type": "markdown", "id": "f6835135-ba94-48ef-a793-c6b2dbd47bff", "metadata": {}, "source": [ "## Instrumental Variable Example 1\n", "\n", "To illustrate the use of `GMMEstimator`, we will consider instrumental variable estimation of the effect of $A$ on $Y$. To place us in the *over*-identified setting, we will have access to two different instruments. Data for this example are simulated according to the following mechanism." ] }, { "cell_type": "code", "execution_count": 2, "id": "73f1bf89-52d1-42d6-a684-ea96e0832721", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|  | W | Z1 | Z2 | A | Y |
|---|---|---|---|---|---|
| count | 500.000000 | 500.000000 | 500.000000 | 500.000000 | 500.000000 |
| mean | 0.226000 | 0.033211 | 0.003048 | 0.068616 | 0.137321 |
| std | 0.418658 | 0.517523 | 0.523425 | 1.216829 | 2.408356 |
| min | 0.000000 | -1.228100 | -1.545184 | -4.180748 | -8.943065 |
| 25% | 0.000000 | -0.355530 | -0.326217 | -0.680898 | -1.353907 |
| 50% | 0.000000 | 0.033815 | 0.016186 | 0.083086 | 0.189151 |
| 75% | 0.000000 | 0.367397 | 0.348956 | 0.871490 | 1.644282 |
| max | 1.000000 | 1.732386 | 2.135354 | 3.534259 | 6.914683 |