class: center, middle, inverse, title-slide

.title[
# CSSS/POLS 512
]
.subtitle[
## Lab 6: Nickell Bias and GMM Dynamic Panel Estimators
]
.author[
### Ramses Llobet
]
.date[
### Spring 2026
]

---

# Preview

.pull-left[
**Dynamic panel estimators**

- Nickell bias and where it comes from
- The IV/GMM fix, by hand
- `pgmm()`: formula syntax and arguments
- The five diagnostic tests on every printout
]

.pull-right[
**Application: taxes and cigarettes**

- Diff-GMM on a 48-state × 11-year panel
- Counterfactual: a 60-cent tax hike
- Three quantities of interest: **EV**, **FD**, **RR**
]

---
class: inverse, center, middle

# Nickell bias

---

# Nickell bias: the mechanism

The simplest dynamic panel:

`$$y_{it} = \alpha_i + \phi y_{i,t-1} + \beta x_{it} + \varepsilon_{it}, \quad \varepsilon_{it} \sim \text{iid}$$`

The **within transformation** subtracts unit means: `\(\tilde y_{it} = y_{it} - \bar y_i\)`.

The trouble: `\(\tilde y_{i,t-1}\)` contains `\(\bar y_{i,-1}\)`, which depends on `\(\varepsilon_{i,1}, \ldots, \varepsilon_{i,T-1}\)`. The within error `\(\tilde \varepsilon_{it}\)` contains `\(\bar \varepsilon_i\)`, which depends on `\(\varepsilon_{i,1}, \ldots, \varepsilon_{i,T}\)`. **They share terms.**

--

Closed-form result (Nickell 1981):

`$$\boxed{\;\text{plim}(\hat\phi_{\text{FE}} - \phi) \;\approx\; -\frac{1+\phi}{T-1}\;}$$`

- **Downward** on `\(\phi\)`: FE understates persistence.
- **Order `\(1/T\)`**: small `\(T\)` is bad.
- **`\(N \to \infty\)` does not save you**: only more time periods do.

---

# The fix: first differences plus an instrument

Difference both sides; `\(\alpha_i\)` vanishes (`\(\Delta \alpha_i = 0\)`):

`$$\Delta y_{it} = \phi \Delta y_{i,t-1} + \beta \Delta x_{it} + \Delta \varepsilon_{it}$$`

But `\(\Delta y_{i,t-1}\)` and `\(\Delta \varepsilon_{it}\)` both contain `\(\varepsilon_{i,t-1}\)`, so we **need an instrument**.
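Both problems can be checked with a small simulation. The sketch below is a minimal base-R illustration, assuming a model with no covariate and standard normal errors; the values `N = 2000`, `T_obs = 6`, `phi = 0.5`, and the burn-in length are illustrative choices, not taken from the lab. It compares the within (FE) estimate and OLS on first differences with the true `\(\phi\)`.

```r
set.seed(512)
N <- 2000; T_obs <- 6; phi <- 0.5; burn <- 50   # illustrative values (assumption)

# Simulate y_it = alpha_i + phi * y_{i,t-1} + eps_it, one row per unit
alpha <- rnorm(N)
y <- matrix(0, nrow = N, ncol = burn + T_obs + 1)
y[, 1] <- alpha / (1 - phi)                      # start at each unit's long-run mean
for (t in 2:ncol(y)) {
  y[, t] <- alpha + phi * y[, t - 1] + rnorm(N)
}
y <- y[, (burn + 1):ncol(y)]                     # keep y_0, ..., y_T after burn-in

# Within (FE) estimator: demean y_t and y_{t-1} by unit, then regress
y_cur <- y[, -1]
y_lag <- y[, -ncol(y)]
yc <- y_cur - rowMeans(y_cur)
yl <- y_lag - rowMeans(y_lag)
phi_fe <- sum(yl * yc) / sum(yl^2)

# OLS on first differences, with no instrument
dy <- y[, -1] - y[, -ncol(y)]
dy_cur <- dy[, -1]
dy_lag <- dy[, -ncol(dy)]
phi_fd <- sum(dy_lag * dy_cur) / sum(dy_lag^2)

# Compare with the truth and the leading-order Nickell approximation
round(c(truth = phi, FE = phi_fe, FD_OLS = phi_fd,
        nickell_approx = phi - (1 + phi) / (T_obs - 1)), 3)
```

Both estimates should land well below the true value of 0.5. Raising `T_obs` shrinks the FE gap; raising `N` does not.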
--

**`\(y_{i,t-2}\)`** is the natural choice (Anderson-Hsiao 1981):

- **Relevant**: `\(\Delta y_{i,t-1} = y_{i,t-1} - y_{i,t-2}\)` contains `\(y_{i,t-2}\)`. ✓
- **Exogenous**: `\(y_{i,t-2}\)` depends on `\(\varepsilon\)` only up to `\(t-2\)`, so it is uncorrelated with `\(\Delta \varepsilon_{it} = \varepsilon_{it} - \varepsilon_{i,t-1}\)`. ✓

Closed-form IV (= one-step GMM) estimator:

`$$\hat\theta = \big(Z'X\big)^{-1} Z' \Delta y$$`

with `\(Z = [y_{i,t-2}, \Delta x_{it}]\)` and `\(X = [\Delta y_{i,t-1}, \Delta x_{it}]\)`. **One matrix solve**; the lab works through this from scratch.

---

# The family of GMM estimators

.pull-left[

| Estimator | Instruments for `\(\Delta y_{i,t-1}\)` |
|:---|:---|
| **Anderson-Hsiao (1981)** | just `\(y_{i,t-2}\)` |
| **Arellano-Bond (1991)**<br>*Difference GMM* | full sequence `\(y_{i,t-2}, y_{i,t-3}, \ldots\)` |
| **Blundell-Bond (1998)**<br>*System GMM* | + lagged differences for a level equation |

]

.pull-right[
**Plus a likelihood outsider:**

| Estimator | Idea |
|:---|:---|
| **OPM** (Pickup et al. 2017) | Marginalize `\(\alpha_i\)` via orthogonal reparameterization; no instruments. |

**When to reach for what:**

- Diff-GMM: standard default.
- Sys-GMM: `\(\phi\)` near 1, or time-invariant controls matter.
- OPM: small `\(T\)` (< 8) where GMM is unreliable.
]

---
class: inverse, center, middle

# `pgmm()` and its diagnostic battery

---

# Anatomy of a `pgmm()` formula

The formula uses up to three blocks separated by `|`:

```
y ~ <regressors> | <GMM-style instruments> | <regular instruments (optional)>
```

- **Left of the first `|`**: the regression equation in *levels*. `lag(y, 1)` is the lagged DV; `lag(x, 0:1)` adds both the contemporaneous and lag-1 versions of `x`.
- **First `|` block**: GMM-style instruments. `lag(y, 2:99)` stacks **one moment per period × lag**. This is where over-identification lives.
- **Second `|` block** (optional): regular IV-style instruments, with one stacked moment per variable and no per-period expansion.
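To see where the over-identification comes from, the moment count implied by a GMM-style block can be tallied by hand. The sketch below assumes a balanced panel, a single lagged DV, and differenced equations for periods 3 through `\(T\)`. `count_gmm_moments()` is a hypothetical helper written for illustration, not a `plm` function, and `plm`'s internal bookkeeping may differ in edge cases.

```r
# Hypothetical helper (not part of plm): number of moments generated by
# lag(y, min_lag:max_lag) in a balanced difference-GMM with T_obs periods.
count_gmm_moments <- function(T_obs, min_lag = 2, max_lag = 99) {
  t_eq <- 3:T_obs                        # periods with a usable differenced equation
  deepest <- pmin(max_lag, t_eq - 1)     # y_{i,1} is the oldest level available at t
  per_period <- pmax(0, deepest - min_lag + 1)
  sum(per_period)
}

full_set   <- count_gmm_moments(11, min_lag = 2, max_lag = 99)
capped_set <- count_gmm_moments(11, min_lag = 2, max_lag = 4)
c(full = full_set, capped = capped_set)
```

For an 11-period panel this gives 45 moments with `lag(y, 2:99)` and 24 with `lag(y, 2:4)`. Adding two differenced regressors to the 45 would match the 47 instruments reported in the diagnostics table for `fit_cig`.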
--

**Key arguments:**

| Argument | Choices | What it does |
|:---|:---|:---|
| `data` | a `pdata.frame` | Panel with `index = c(unit, time)` |
| `effect` | `"individual"`, `"twoways"` | Unit FE only, or unit + period FE |
| `model` | `"onestep"`, `"twosteps"` | One-step vs two-step weighting; two-step needs Windmeijer SEs |
| `transformation` | `"d"`, `"ld"` | Difference GMM vs system GMM (levels + differences) |

---

# A canonical Diff-GMM fit on `EmplUK`

```r
fit_demo <- pgmm(
  log(emp) ~ lag(log(emp), 1:2) + lag(log(wage), 0:1) +
    log(capital) + lag(log(output), 0:1) |
    lag(log(emp), 2:4),        # GMM-style instruments
  data = EmplUK,
  effect = "twoways",          # unit + time FE
  model = "twosteps",
  transformation = "d"
)
summary(fit_demo, robust = TRUE)  # Windmeijer SEs
```

--

**`summary.pgmm` prints, in order:**

1. The coefficient table (with Windmeijer-corrected SEs if `robust = TRUE`).
2. **Sargan test** of the overidentifying restrictions.
3. **AR(1) and AR(2) tests** on the differenced residuals.
4. **Wald test for time dummies** (only with `effect = "twoways"`).

---

# The five-row diagnostic checklist

| Test | `\(H_0\)` | What we want | Note |
|:---|:---|:---|:---|
| **Sargan / Hansen** | `\(E[Z'\varepsilon] = 0\)` | **Fail to reject** (`\(p > 0.10\)`) | `\(p \approx 1.0\)` is **bad** (too many instruments) |
| **AR(1) on `\(\Delta\hat\varepsilon\)`** | no first-order serial corr | **Reject** (`\(p < 0.05\)`) | Mechanical from differencing |
| **AR(2) on `\(\Delta\hat\varepsilon\)`** | no second-order serial corr | **Fail to reject** (`\(p > 0.05\)`) | Rejection invalidates lag-2 IV |
| **Wald on `\(\tau_t\)`** | all year dummies = 0 | Reject → keep year FE | Only with `effect = "twoways"` |
| **# instruments** | (rule of thumb) | `\(<\,N\)` (ideally `\(\ll N\)`) | Roodman 2009 |

--

**What to do if a test fails:**

- Sargan rejects → tighten lag depth (start instruments from lag 3) or add more LDV lags.
- AR(2) rejects → same remedies; the levels error is serially correlated, not iid.
- Too many instruments → cap the lag range (`lag(y, 2:4)` instead of `2:99`).

---
class: inverse, center, middle

# Application: taxes and cigarettes

---

# The cigarette panel

48 U.S. states, 1985–1995 (balanced, `\(T = 11\)`). Variables we need:

| Variable | Description |
|:---|:---|
| `packpc` | packs/capita (outcome) |
| `income`, `pop` | for per-capita income |
| `tax`, `taxs`, `avgprs` | excise tax, total tax, average price (cents/pack) |
| `cpi` | for inflation-adjusting to 1995 dollars |

--

**The substantive question.** By how much would a tax hike reduce cigarette consumption, and how does the effect unfold over time?

We fit Diff-GMM on the dynamic specification

`$$\text{packpc}_{it} = \alpha_i + \phi \, \text{packpc}_{i,t-1} + \beta_1 \, \text{income95pc}_{it} + \beta_2 \, \text{avgprs95}_{it} + \varepsilon_{it}$$`

and forecast a 3-year trajectory under a 60-cent tax shock.

---

# The Diff-GMM fit on cigarette consumption

```r
fit_cig <- pgmm(
  packpc ~ lag(packpc, 1) + income95pc + avgprs95 |
    lag(packpc, 2:99),
  data = pdat_cig,           # pdata.frame with index c("state","year")
  effect = "individual",     # state FE; no year dummies (small T)
  model = "twosteps",        # optimal two-step weighting
  transformation = "d"       # difference GMM
)
```

--

Table: Diff-GMM on packpc with Windmeijer-corrected SEs.

|term           | Estimate| Std. Error| z-value| Pr(>\|z\|)|
|:--------------|--------:|----------:|-------:|----------:|
|lag(packpc, 1) |    0.639|      0.055|  11.552|      0.000|
|income95pc     |   -0.479|      0.496|  -0.965|      0.334|
|avgprs95       |   -0.180|      0.028|  -6.404|      0.000|

**Read off**: persistence `\(\hat\phi\)` is positive and below 1; **higher price → fewer packs** (`\(\hat\beta_{\text{avgprs}}\)` negative); the income coefficient is statistically indistinguishable from zero (`\(p = 0.334\)`).

---

# Diagnostics for `fit_cig`

Table: Diagnostic battery on the cigarette Diff-GMM fit.

|test              | statistic| df_or_N| p_value|decision                       |
|:-----------------|---------:|-------:|-------:|:------------------------------|
|Sargan            |    47.099|      44|   0.347|want fail to reject (p > 0.10) |
|AR(1) on Δresid   |    -3.444|      NA|   0.001|want reject (p < 0.05)         |
|AR(2) on Δresid   |    -0.537|      NA|   0.592|want fail to reject (p > 0.05) |
|# instruments / N |    47.000|      48|      NA|want ratio < 1                 |

--

**Reading the table.** All three tests pass: the Sargan p-value (0.347) clears the 0.10 threshold, AR(1) rejects, and AR(2) does not. The weak spot is the instrument count: the full `lag(packpc, 2:99)` set produces 47 instruments on an `\(N = 48\)` panel, barely under the `\(< N\)` rule of thumb and the setting where the **Roodman pathology** (§2.5 in the lab) is a concern. For a more trustworthy Sargan test, cap at `lag(packpc, 2:4)`.

---

# Counterfactual: a 60-cent tax shock

```r
library(MASS)    # mvrnorm()
library(simcf)   # cfMake(), cfChange(), ldvsimev(), ldvsimfd(), ldvsimrr()

periods_out <- 3

# Sample 1000 parameter draws from MVN(coef, Windmeijer vcov)
simparam <- mvrnorm(n = 1000, mu = coefficients(fit_cig), Sigma = vcovHC(fit_cig))
simphi   <- simparam[, 1]                   # draws of the LDV coefficient
simbeta  <- simparam[, -1, drop = FALSE]    # draws of the covariate coefficients

# Treatment: +60 cent change in avgprs95 at period 1, then sustained
xhyp <- cfMake(packpc ~ income95pc + avgprs95 - 1, data = pdat_cig, nscen = periods_out)
xhyp$x    <- 0 * xhyp$x
xhyp$xpre <- 0 * xhyp$xpre
xhyp <- cfChange(xhyp, "avgprs95", x = 60, scen = 1)

# Baseline: zero change
xbase <- xhyp
xbase$x <- xbase$xpre

# Three simulators: EV (level), FD (treat - base), RR (% change)
# lagY, initialY, and the `...` arguments are not shown on this slide
sev_treat <- ldvsimev(xhyp, b = simbeta, phi = simphi, lagY = lagY,
                      transform = "diff", initialY = initialY, ...)
sev_base  <- ldvsimev(xbase, ...)
sfd       <- ldvsimfd(xhyp, ...)
srr       <- ldvsimrr(xhyp, ...)
```

---

# The forecast: 3 panels

<div class="figure" style="text-align: center">
<img src="Lab6_slides_files/figure-html/p3-plot-1.svg" alt="Counterfactual forecast of a 60-cent tax hike. EV: predicted packs/capita under hike (solid) vs baseline (dashed). FD: absolute difference. RR: percent change. Bands are 95% intervals from 1000 draws." width="864" />
<p class="caption">Counterfactual forecast of a 60-cent tax hike. EV: predicted packs/capita under hike (solid) vs baseline (dashed). FD: absolute difference.
RR: percent change. Bands are 95% intervals from 1000 draws.</p>
</div>

**Reading the panels.** Higher price → fewer packs. The effect builds via the LDV: year 1 is the immediate response; later years carry forward through `\(\hat\phi\)`.

---

# Takeaways

.pull-left[
**Methodologically**

1. FE-LDV is **biased downward** when `\(T\)` is small; the Nickell formula gives the leading-order term.
2. The IV/GMM family removes `\(\alpha_i\)` by differencing and instruments away the endogeneity that remains.
3. **`pgmm()`** packages this for you, but check the **five-row diagnostic table** every time.
4. **Too many instruments** ruin Sargan and inflate precision (Roodman 2009).
]

.pull-right[
**Substantively (cigarettes)**

1. A 60-cent tax hike produces a **measurable, persistent drop** in packs/capita.
2. The dynamic structure delivers the effect over **multiple periods**, not just at impact.
3. `simcf` lets us express the same effect as a **level forecast**, a **first difference**, or a **percent change**; pick the metric the audience needs.
]

---

# References

- Anderson & Hsiao (1981): IV in dynamic models with error components
- Arellano & Bond (1991): Difference GMM
- Blundell & Bond (1998): System GMM
- Nickell (1981): bias derivation
- Roodman (2009a, b): practical guide; "Too Many Instruments"
- Pickup et al. (2017); Pickup & Hopkins (2022): orthogonal-panel model
- Windmeijer (2005): finite-sample SE correction

---
class: inverse, center, middle

# Let's get started!

Open `Lab6.Rmd`.

`rllobet@uw.edu`