{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "7de50c21-8db9-4ee9-b10a-23c59b006997",
   "metadata": {},
   "source": [
    "# GMM for Over-Identified Parameters\n",
    "\n",
    "Both `MEstimator` and `GMMEstimator` can be used interchangeably for many problems. The key difference between them is how the point estimates are being estimated. `MEstimator` uses a root-finding algorithm to find the approximate zeroes of the estimating equations, whereas `GMMEstimator` takes a matrix product of the estimating equations and searches for the minimum. Broadly, these two approaches are simply two different ways to compute the point estimates. Preference for one or the other in any particular scenario will come down to the problem. For this reason, `MEstimator` could be replaced by `GMMEstimator` in any of the applied examples.\n",
    "\n",
    "However, GMM and `GMMEstimator` are also able to address problems where there are more estimating equations than parameters. These parameters are often referred to as *over-identified*. This is in contrast to *just-identified* problems, where there is an equal number of estimating equations and parameters. Due to how the optimization problem is structured, only minimization (and thus `GMMEstimator`) can be used in this scenario. \n",
    "\n",
    "Here, we illustrate the use of `GMMEstimator` with over-identified parameters\n",
    "\n",
    "## Setup "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "f3c4fa0b-5074-4651-854b-5b3c018056d0",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Versions\n",
      "NumPy:         2.3.5\n",
      "SciPy:         1.16.3\n",
      "pandas:        2.3.3\n",
      "Delicatessen:  4.1\n"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "import scipy as sp\n",
    "import pandas as pd\n",
    "\n",
    "import delicatessen as deli\n",
    "from delicatessen import MEstimator, GMMEstimator\n",
    "from delicatessen.estimating_equations import ee_regression\n",
    "from delicatessen.utilities import inverse_logit\n",
    "\n",
    "print(\"Versions\")\n",
    "print(\"NumPy:        \", np.__version__)\n",
    "print(\"SciPy:        \", sp.__version__)\n",
    "print(\"pandas:       \", pd.__version__)\n",
    "print(\"Delicatessen: \", deli.__version__)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f6835135-ba94-48ef-a793-c6b2dbd47bff",
   "metadata": {},
   "source": [
    "## Instrumental Variable Example 1\n",
    "\n",
    "To illustrate use of `GMMEstimator`, we will consider the use of an instrumental variable for the effect of $A$ on $Y$. To coincide with the *over*-identified setting, we will have access to two different instruments. Data for this example are simulated according to the following mechanism"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "73f1bf89-52d1-42d6-a684-ea96e0832721",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>W</th>\n",
       "      <th>Z1</th>\n",
       "      <th>Z2</th>\n",
       "      <th>A</th>\n",
       "      <th>Y</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>count</th>\n",
       "      <td>500.000000</td>\n",
       "      <td>500.000000</td>\n",
       "      <td>500.000000</td>\n",
       "      <td>500.000000</td>\n",
       "      <td>500.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mean</th>\n",
       "      <td>0.226000</td>\n",
       "      <td>0.033211</td>\n",
       "      <td>0.003048</td>\n",
       "      <td>0.068616</td>\n",
       "      <td>0.137321</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>std</th>\n",
       "      <td>0.418658</td>\n",
       "      <td>0.517523</td>\n",
       "      <td>0.523425</td>\n",
       "      <td>1.216829</td>\n",
       "      <td>2.408356</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>min</th>\n",
       "      <td>0.000000</td>\n",
       "      <td>-1.228100</td>\n",
       "      <td>-1.545184</td>\n",
       "      <td>-4.180748</td>\n",
       "      <td>-8.943065</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25%</th>\n",
       "      <td>0.000000</td>\n",
       "      <td>-0.355530</td>\n",
       "      <td>-0.326217</td>\n",
       "      <td>-0.680898</td>\n",
       "      <td>-1.353907</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>50%</th>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.033815</td>\n",
       "      <td>0.016186</td>\n",
       "      <td>0.083086</td>\n",
       "      <td>0.189151</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>75%</th>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.367397</td>\n",
       "      <td>0.348956</td>\n",
       "      <td>0.871490</td>\n",
       "      <td>1.644282</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>max</th>\n",
       "      <td>1.000000</td>\n",
       "      <td>1.732386</td>\n",
       "      <td>2.135354</td>\n",
       "      <td>3.534259</td>\n",
       "      <td>6.914683</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                W          Z1          Z2           A           Y\n",
       "count  500.000000  500.000000  500.000000  500.000000  500.000000\n",
       "mean     0.226000    0.033211    0.003048    0.068616    0.137321\n",
       "std      0.418658    0.517523    0.523425    1.216829    2.408356\n",
       "min      0.000000   -1.228100   -1.545184   -4.180748   -8.943065\n",
       "25%      0.000000   -0.355530   -0.326217   -0.680898   -1.353907\n",
       "50%      0.000000    0.033815    0.016186    0.083086    0.189151\n",
       "75%      0.000000    0.367397    0.348956    0.871490    1.644282\n",
       "max      1.000000    1.732386    2.135354    3.534259    6.914683"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Set up for dgm\n",
    "np.random.seed(777)\n",
    "n = 500\n",
    "\n",
    "d = pd.DataFrame()\n",
    "d['W'] = np.random.binomial(n=1, p=0.25, size=n)\n",
    "d['Z1'] = np.random.normal(scale=0.5, size=n)\n",
    "d['Z2'] = np.random.normal(scale=0.5, size=n)\n",
    "d['A'] = d['Z1'] + d['Z2'] + np.random.normal(size=n)\n",
    "d['Y'] = 2*d['A'] - 1*d['W']*d['A'] + np.random.normal(scale=1.0, size=n)\n",
    "d.describe()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f1d495c9-86bf-48af-bd64-96c9d30cf0fa",
   "metadata": {},
   "source": [
    "For this instrumental variable analysis, we use the following estimating equation\n",
    "$$ E[Z(Y - \\beta_a A)] = 0 $$\n",
    "where $Z$ is an instrument. In our case, there are two instruments available $Z_1,Z_2$. We might consider applying this previous estimating equation for each instrument separate. The following is code that does this"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "5162b6ad-99e7-485f-9589-bff0dd4221d0",
   "metadata": {},
   "outputs": [],
   "source": [
    "z1 = np.asarray(d['Z1'])\n",
    "z2 = np.asarray(d['Z2'])\n",
    "a = np.asarray(d['A'])\n",
    "y = np.asarray(d['Y'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "cb51ee36-ef54-4aed-a51a-6ba3d0aabd1b",
   "metadata": {},
   "outputs": [],
   "source": [
    "def psi(theta):\n",
    "    ee_z1 = z1 * (y - theta[0]*a)\n",
    "    ee_z2 = z2 * (y - theta[1]*a)\n",
    "    return np.vstack([ee_z1, ee_z2])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "de604e0c-f4d6-433e-8e11-3666267d0416",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "==============================================================\n",
      "              Estimation Method: M-estimator\n",
      "--------------------------------------------------------------\n",
      "No. Observations:         500 | No. Parameters:              2\n",
      "Solving algorithm:         lm | Max Iterations:           5000\n",
      "Solving tolerance:      1e-09 | Allow P-Inverse:             1\n",
      "Derivative Method:     approx | Deriv Approx:            1e-09\n",
      "Small N Correction:      None | Distribution:           Z-stat\n",
      "==============================================================\n",
      "   Theta   StdErr  Z-score      LCL      UCL  P-value  S-value \n",
      "--------------------------------------------------------------\n",
      "    1.76     0.11    15.84     1.54     1.98     0.00   185.42 \n",
      "    1.86     0.10    19.57     1.68     2.05     0.00   280.76 \n",
      "==============================================================\n"
     ]
    }
   ],
   "source": [
    "estr = MEstimator(psi, init=[0., 0.])\n",
    "estr.estimate()\n",
    "estr.print_results()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "a21c95db-eefd-42f5-806f-bcf84176e3f8",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "==============================================================\n",
      "              Estimation Method: GMM-estimator\n",
      "--------------------------------------------------------------\n",
      "No. Observations:         500 | No. Parameters:              2\n",
      "Solving algorithm: nelder-mead | Max Iterations:           5000\n",
      "Solving tolerance:      1e-09 | Allow P-Inverse:             1\n",
      "Derivative Method:     approx | Deriv Approx:            1e-09\n",
      "Small N Correction:      None | Distribution:           Z-stat\n",
      "==============================================================\n",
      "   Theta   StdErr  Z-score      LCL      UCL  P-value  S-value \n",
      "--------------------------------------------------------------\n",
      "    1.76     0.11    15.84     1.54     1.98     0.00   185.42 \n",
      "    1.86     0.10    19.57     1.68     2.05     0.00   280.76 \n",
      "==============================================================\n"
     ]
    }
   ],
   "source": [
    "estr = GMMEstimator(psi, init=[0., 0.])\n",
    "estr.estimate(solver='nelder-mead')\n",
    "estr.print_results()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bb7165ac-1b25-4f57-b022-e8fabd0e1bd3",
   "metadata": {},
   "source": [
    "This analysis (whether done via `MEstimator` or `GMMEstimator`) gives us two different estimates. These two estimates are close to each other (given the precision, as indicated by the confidence or compability intervals), but how do we select which one to report? Well we could report both or try some sort of inverse-variance weighting of the two, but we could also revise our estimating equations to correspond to an *over*-identified setting. \n",
    "\n",
    "Here, our stacked over identified estimating equations are \n",
    "$$ E\n",
    "\\begin{bmatrix}\n",
    "Z_1(Y - \\beta A) \\\\\n",
    "Z_2(Y - \\beta A) \\\\\n",
    "\\end{bmatrix}\n",
    "= 0 \n",
    "$$\n",
    "Note that there is only one parameter, $\\beta$ being estimated, but we have two separate equations. \n",
    "\n",
    "Let's setup this up as an estimating function for `delicatessen`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "b6f47c52-2da4-4430-a442-42e80f7c9f13",
   "metadata": {},
   "outputs": [],
   "source": [
    "def psi(theta):\n",
    "    ee_z1 = z1 * (y - theta*a)\n",
    "    ee_z2 = z2 * (y - theta*a)\n",
    "    return np.vstack([ee_z1, ee_z2])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6e1d73f4-a5a1-4a79-b14d-7deea55060b6",
   "metadata": {},
   "source": [
    "Now, let's see what happens when we try to use `MEstimator` for this estimating function"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "2b7168b4-0acb-42e4-8c3b-4b42531ea18c",
   "metadata": {},
   "outputs": [
    {
     "ename": "ValueError",
     "evalue": "The number of initial values and the number of rows returned by `stacked_equations` should be equal but there are 1 initial values and the `stacked_equations` function returns 2 row(s).",
     "output_type": "error",
     "traceback": [
      "\u001b[31m---------------------------------------------------------------------------\u001b[39m",
      "\u001b[31mValueError\u001b[39m                                Traceback (most recent call last)",
      "\u001b[36mCell\u001b[39m\u001b[36m \u001b[39m\u001b[32mIn[8]\u001b[39m\u001b[32m, line 2\u001b[39m\n\u001b[32m      1\u001b[39m estr = MEstimator(psi, init=[\u001b[32m0.\u001b[39m,])\n\u001b[32m----> \u001b[39m\u001b[32m2\u001b[39m \u001b[43mestr\u001b[49m\u001b[43m.\u001b[49m\u001b[43mestimate\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\n",
      "\u001b[36mFile \u001b[39m\u001b[32m~\\Documents\\open-source\\Delicatessen\\delicatessen\\estimation.py:683\u001b[39m, in \u001b[36mMEstimator.estimate\u001b[39m\u001b[34m(self, solver, maxiter, tolerance, deriv_method, dx, allow_pinv)\u001b[39m\n\u001b[32m    679\u001b[39m     \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mValueError\u001b[39;00m(\u001b[33m\"\u001b[39m\u001b[33mThe number of initial values and the number of rows returned by `stacked_equations` \u001b[39m\u001b[33m\"\u001b[39m\n\u001b[32m    680\u001b[39m                      \u001b[33m\"\u001b[39m\u001b[33mshould be equal but there are \u001b[39m\u001b[33m\"\u001b[39m + \u001b[38;5;28mstr\u001b[39m(np.asarray(\u001b[38;5;28mself\u001b[39m.init).shape[\u001b[32m0\u001b[39m]) + \u001b[33m\"\u001b[39m\u001b[33m initial values \u001b[39m\u001b[33m\"\u001b[39m\n\u001b[32m    681\u001b[39m                      \u001b[33m\"\u001b[39m\u001b[33mand the `stacked_equations` function returns \u001b[39m\u001b[33m\"\u001b[39m + \u001b[38;5;28mstr\u001b[39m(\u001b[32m1\u001b[39m) + \u001b[33m\"\u001b[39m\u001b[33m row.\u001b[39m\u001b[33m\"\u001b[39m)\n\u001b[32m    682\u001b[39m \u001b[38;5;28;01melif\u001b[39;00m np.asarray(\u001b[38;5;28mself\u001b[39m.init).shape[\u001b[32m0\u001b[39m] != vals_at_init.shape[\u001b[32m1\u001b[39m]:\n\u001b[32m--> \u001b[39m\u001b[32m683\u001b[39m     \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mValueError\u001b[39;00m(\u001b[33m\"\u001b[39m\u001b[33mThe number of initial values and the number of rows returned by `stacked_equations` \u001b[39m\u001b[33m\"\u001b[39m\n\u001b[32m    684\u001b[39m                      \u001b[33m\"\u001b[39m\u001b[33mshould be equal but there are \u001b[39m\u001b[33m\"\u001b[39m + \u001b[38;5;28mstr\u001b[39m(np.asarray(\u001b[38;5;28mself\u001b[39m.init).shape[\u001b[32m0\u001b[39m]) + \u001b[33m\"\u001b[39m\u001b[33m initial values \u001b[39m\u001b[33m\"\u001b[39m\n\u001b[32m    685\u001b[39m                      \u001b[33m\"\u001b[39m\u001b[33mand the `stacked_equations` function returns \u001b[39m\u001b[33m\"\u001b[39m + \u001b[38;5;28mstr\u001b[39m(vals_at_init.shape[\u001b[32m1\u001b[39m])\n\u001b[32m    686\u001b[39m                      + \u001b[33m\"\u001b[39m\u001b[33m row(s).\u001b[39m\u001b[33m\"\u001b[39m)\n\u001b[32m    687\u001b[39m \u001b[38;5;28;01melif\u001b[39;00m vals_at_init.ndim > \u001b[32m2\u001b[39m:\n\u001b[32m    688\u001b[39m     \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mValueError\u001b[39;00m(\u001b[33m\"\u001b[39m\u001b[33mA 2-dimensional array is expected, but the `stacked_equations` returns a \u001b[39m\u001b[33m\"\u001b[39m\n\u001b[32m    689\u001b[39m                      + \u001b[38;5;28mstr\u001b[39m(vals_at_init.ndim) + \u001b[33m\"\u001b[39m\u001b[33m-dimensional array.\u001b[39m\u001b[33m\"\u001b[39m)\n",
      "\u001b[31mValueError\u001b[39m: The number of initial values and the number of rows returned by `stacked_equations` should be equal but there are 1 initial values and the `stacked_equations` function returns 2 row(s)."
     ]
    }
   ],
   "source": [
    "estr = MEstimator(psi, init=[0.,])\n",
    "estr.estimate()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ca9ed407-9599-41c7-a2b6-6578b5e3a4ad",
   "metadata": {},
   "source": [
    "We get an error from `delicatessen`. This error states that the dimension of the parameters must match the dimension of the estimating equations. This is because of how the root-finding procedure is structured. We can't use `MEstimator` for this type of problem.\n",
    "\n",
    "However, we are still able to use `GMMEstimator`. Lets's see how that works"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "dbc5901e-16ec-4a97-b6ac-3172494447c6",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "==============================================================\n",
      "              Estimation Method: GMM-estimator\n",
      "--------------------------------------------------------------\n",
      "No. Observations:         500 | No. Parameters:              1\n",
      "Solving algorithm:       bfgs | Max Iterations:           5000\n",
      "Solving tolerance:      1e-09 | Allow P-Inverse:             1\n",
      "Derivative Method:     approx | Deriv Approx:            1e-09\n",
      "OverID Iteration:          10 | OverID Tolerance:        1e-09\n",
      "Small N Correction:      None | Distribution:           Z-stat\n",
      "==============================================================\n",
      "   Theta   StdErr  Z-score      LCL      UCL  P-value  S-value \n",
      "--------------------------------------------------------------\n",
      "    1.82     0.07    26.07     1.68     1.96     0.00   495.28 \n",
      "==============================================================\n"
     ]
    }
   ],
   "source": [
    "estr = GMMEstimator(psi, init=[0.,])\n",
    "estr.estimate()\n",
    "estr.print_results()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c31c553a-082b-42d5-9b70-06ffc785c3ae",
   "metadata": {},
   "source": [
    "Here, we are able to obtain an estimate. Perhaps more interestingly, our confidence intervals are narrower than the previous implementation that treated instruments as separate. This feature is due to us being able to leverage more information to estimate a single parameter in this second setup (under the assumption that both are valid instruments that are not weak). \n",
    "\n",
    "## Instrumental Variable Example 2\n",
    "\n",
    "To develop a slightly more complicated example, we are going to extend the previous instrumental variable analysis with some transportability methods. Here, interest is in the effect of $A$ on $Y$, except we are interested in the effect a different population. Importantly, we think the only relevant variable differing between our populations is $W$, which as measured in both populations. To transport, we are going to use inverse odds of sampling weights (see Cole et al. applied example for more details). Again, we have the same instruments $Z_1,Z_2$.\n",
    "\n",
    "Below is some simulated data corresponding to this scenario"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "2b440f03-5345-48bd-8c19-8659904eadc3",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Set up for dgm\n",
    "np.random.seed(777)\n",
    "n = 500\n",
    "\n",
    "# External data\n",
    "d0 = pd.DataFrame()\n",
    "d0['W'] = np.random.binomial(n=1, p=0.25, size=n)\n",
    "d0['Z1'] = np.random.normal(scale=0.5, size=n)\n",
    "d0['Z2'] = np.random.normal(scale=0.5, size=n)\n",
    "d0['A'] = d0['Z1'] + d0['Z2'] + np.random.normal(size=n)\n",
    "d0['Y'] = 2*d0['A'] - 1*d0['W']*d0['A'] + np.random.normal(scale=1.0, size=n)\n",
    "d0['S'] = 0\n",
    "\n",
    "# Target data\n",
    "d1 = pd.DataFrame()\n",
    "d1['W'] = np.random.binomial(n=1, p=0.75, size=n)\n",
    "d1['Z1'] = -99\n",
    "d1['Z2'] = -99\n",
    "d1['A'] = -99\n",
    "d1['Y'] = -99\n",
    "d1['S'] = 1\n",
    "\n",
    "# Stacking data together\n",
    "d = pd.concat([d1, d0], ignore_index=True)\n",
    "d['C'] = 1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "5a3c3bf0-cec5-4a45-9dc4-adb2a6d28bd9",
   "metadata": {},
   "outputs": [],
   "source": [
    "W = np.asarray(d[['C', 'W']])\n",
    "z1 = np.asarray(d['Z1'])\n",
    "z2 = np.asarray(d['Z2'])\n",
    "a = np.asarray(d['A'])\n",
    "y = np.asarray(d['Y'])\n",
    "s = np.asarray(d['S'])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "de573817-5307-455e-acb9-88feb5f5b41c",
   "metadata": {},
   "source": [
    "For this problem, we are going to estimate inverse odds of sampling weights. To do that, we need to estimate the probability of $S$ (the source population indicator) given $W$. We will do this using logistic regression. Below is an estimating equation that illustrates this process"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "dd56644a-3b95-4ff2-9cb3-d92151af9cab",
   "metadata": {},
   "outputs": [],
   "source": [
    "def psi(theta):\n",
    "    # This is how the inverse odds weights will be computed\n",
    "    # pi_s = inverse_logit(np.dot(W, beta))\n",
    "    # iosw = (1 - s) * pi_s / (1 - pi_s)\n",
    "    return ee_regression(theta=theta, y=s, X=W, model='logistic')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "42d16f70-3385-44bc-a0e9-685b556efc54",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "==============================================================\n",
      "              Estimation Method: GMM-estimator\n",
      "--------------------------------------------------------------\n",
      "No. Observations:        1000 | No. Parameters:              2\n",
      "Solving algorithm:       bfgs | Max Iterations:           5000\n",
      "Solving tolerance:      1e-09 | Allow P-Inverse:             1\n",
      "Derivative Method:     approx | Deriv Approx:            1e-09\n",
      "Small N Correction:      None | Distribution:           Z-stat\n",
      "==============================================================\n",
      "   Theta   StdErr  Z-score      LCL      UCL  P-value  S-value \n",
      "--------------------------------------------------------------\n",
      "   -1.08     0.10   -10.72    -1.28    -0.89     0.00    86.60 \n",
      "    2.27     0.15    15.36     1.98     2.56     0.00   174.45 \n",
      "==============================================================\n"
     ]
    }
   ],
   "source": [
    "estr = GMMEstimator(psi, init=[0., 0.])\n",
    "estr.estimate()\n",
    "estr.print_results()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "45ab195a-2ade-4031-9b89-f1d6402d3224",
   "metadata": {},
   "source": [
    "To combine these weights with the instrumental variable setup from before, we will use the following estimating equations\n",
    "$$ E\n",
    "\\begin{bmatrix}\n",
    "Z_1(Y - \\beta A) \\pi_s(W) (1-S) \\\\\n",
    "Z_2(Y - \\beta A) \\pi_s(W) (1-S) \\\\\n",
    "\\psi_s(S,W; \\alpha) \\\\\n",
    "\\end{bmatrix}\n",
    "= 0 \n",
    "$$\n",
    "where $\\pi_s(W)$ is the inverse odds weights and $\\psi_s$ is the estimating function for the logistic model of $W$ on $S$. Note that only the external data contributes to the instrumental variable functions (since $A,Y,Z_1,Z_2$ is missing in the target data). Code for these equations is given in the following"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "910ecdac-f94a-43f3-a311-9f4210a1145c",
   "metadata": {},
   "outputs": [],
   "source": [
    "def psi(theta):\n",
    "    beta = theta[0]\n",
    "    alpha = theta[1:]\n",
    "\n",
    "    # Calculating inverse odds of sampling weights\n",
    "    pi_s = inverse_logit(np.dot(W, alpha))\n",
    "    iosw = (1 - s) * pi_s / (1 - pi_s)\n",
    "\n",
    "    # Estimating functions\n",
    "    ee_sample = ee_regression(theta=alpha, y=s, X=W, model='logistic')\n",
    "    ee_z1 = z1 * (y - beta*a) * iosw * (1 - s)\n",
    "    ee_z2 = z2 * (y - beta*a) * iosw * (1 - s)\n",
    "    return np.vstack([ee_z1, ee_z2, ee_sample])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "0ce13805-f893-464c-a3d0-6ae7b40b4872",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "==============================================================\n",
      "              Estimation Method: GMM-estimator\n",
      "--------------------------------------------------------------\n",
      "No. Observations:        1000 | No. Parameters:              3\n",
      "Solving algorithm:       bfgs | Max Iterations:           5000\n",
      "Solving tolerance:      1e-09 | Allow P-Inverse:             1\n",
      "Derivative Method:     approx | Deriv Approx:            1e-09\n",
      "OverID Iteration:          10 | OverID Tolerance:        1e-09\n",
      "Small N Correction:      None | Distribution:           Z-stat\n",
      "==============================================================\n",
      "   Theta   StdErr  Z-score      LCL      UCL  P-value  S-value \n",
      "--------------------------------------------------------------\n",
      "    1.36     0.11    12.53     1.15     1.58     0.00   117.16 \n",
      "   -1.08     0.10   -10.72    -1.28    -0.89     0.00    86.59 \n",
      "    2.27     0.15    15.36     1.98     2.56     0.00   174.46 \n",
      "==============================================================\n"
     ]
    }
   ],
   "source": [
    "estr = GMMEstimator(psi, init=[0, 0, 0])\n",
    "estr.estimate()\n",
    "estr.print_results()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "25ee8404-77b7-452b-82af-caeeebb2694d",
   "metadata": {},
   "source": [
    "This example highlights how we can combine *over*- and *just*-identified parameters into a joint set of estimating equations with `GMMEstimator`. \n",
    "\n",
    "Note: you may notice that the nuisance model parameters differ. From my understanding, these differences are due to how the weight matrix is updated and the contributions across the different estimating equations. If you dig further into this example, you will also note that the variance estimates differ for the nuisance parameters as well. Again, this seems to be a result of the differing point estimates. If this information is not correct, please reach out to me.\n",
    "\n",
    "This completes the illustration of how `GMMEstimator` can be used for *over* identified parameters."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}