These materials have been revised and updated by successive teaching assistants over recent years.
Revision history:
| Package | Purpose | Key functions |
|---|---|---|
plm |
Panel models, panel unit-root tests, panel SEs | plm(), pdata.frame(), pvar(),
purtest(), vcovHC(), vcovBK(),
vcovSCC(), phtest() |
fixest |
Fast multi-way fixed effects, native cluster SEs | feols(), fixef() |
nlme |
Mixed models with structured error correlations | lme(), corARMA() |
lme4 |
Mixed models (no AR error structure) | lmer() |
parameters |
Within/between decomposition for Mundlak | demean() |
sandwich |
Robust and clustered covariance matrices | vcovCL(), vcovHC() |
lmtest |
Coefficient tests on adjusted SEs | coeftest() |
tseries |
Single-series unit-root tests | adf.test(), pp.test() |
MASS |
Multivariate normal draws for parametric simulation | mvrnorm() |
simcf |
King-Tomz-Wittenberg counterfactual simulators | cfMake(), cfChange(),
ldvsimev(), ldvsimpv() |
knitr |
Include the compiled TikZ DAG image | include_graphics() |
texreg |
Cross-format regression tables | htmlreg(), texreg(),
screenreg() |
tidyverse |
Data manipulation and plotting | ggplot(), tibble(), pipes
(|>) |
| Part | Topic |
|---|---|
| 1 | Panel data: theory, the Przeworski democracy panel, and a dynamics workflow ending in a panel unit-root constellation |
| 2 | A family of panel specifications (Pooled / FE / TWFE / RE / Mundlak-REWB) and their within-vs-between identification |
| 3 | Counterfactual prediction under dynamic fixed effects: one-country forecast and average marginal expectations |
Lab 4 introduced counterfactual forecasting for a single time series. Lab 5 moves the same dynamic logic into panel data. The pedagogical thread, given the short-\(T\) ambiguity in §1.3.4, is interpretation before estimation: identify what variation each model uses, read whether the substantive conclusion is stable, and use fixed-effects counterfactuals only for quantities those models can actually support.
Before any estimator, write the theory. We study the link between
regime type (reg, 1 = democracy) and
economic growth (ln_gdpw) across
countries. The simplified DAG treats oil exporter
status (oil) as a time-invariant confounder and
education (edt) as a slow-moving mediator
on one pathway from democracy to growth.
Three covariate roles:
reg — flips
on coups, pacts, or constitutional transitions; supplies the within-unit
identifying variation.edt
— democracy can expand education, and education can raise growth levels.
Conditioning on edt therefore moves us from a total
regime-growth association toward a more direct association.oil —
binary indicator for major oil exporters; oil can shape both regime type
and GDP levels. It is absorbed by FE/TWFE and estimable in Pooled / RE /
Mundlak.Simplified DAG. Oil confounds democracy and growth; education mediates part of democracy’s association with growth.
What the panel structure buys us:
reg or
edt. Pooling conflates within-country dynamics with
between-country differences.oil is absorbed. If we control for edt, we are
conditioning on a mediator, so the reg coefficient should
be read as closer to a direct association than a total association.reg and edt.oil) but tests RE’s assumption
by splitting every time-varying regressor — including the interaction
reg × edt — into within and between components. If the
between components add nothing, RE is fine; if they do, the within
components carry the FE-style identification.Empirical foreshadow. We will fit \(\ln \text{gdpw} = \alpha + \beta_1\,\text{reg} +
\beta_2\,\text{edt} + \beta_3\,(\text{reg}\!\times\!\text{edt}) +
\beta_4\,\text{oil} + \dots\) in pooled OLS and then compare the
interaction \(\beta_3\) to its FE
counterpart in §2.7–§2.8. The DAG motivates edt as a
mediator; the interaction adds a descriptive heterogeneity question:
does the regime-growth association differ at different education levels,
and does that moderation survive once we strip out cross-country
variation?
# load data
democracy <- read_csv("data/democracy.csv") |>
# lower case
rename_with(tolower) |>
# transform variables
mutate(ln_gdpw = log(gdpw),
country_f = factor(country), # numeric ID → factor for plm
year_f = factor(year))
n_distinct(democracy$country) # N = 135
## [1] 135
n_distinct(democracy$year) # T = 40
## [1] 40
The dataset has two unit identifiers: country (numeric
ID) and ctyname (the actual country name). The panel is
unbalanced: there are 40 unique years in the dataset, but not every
country is observed in every year.
panel_T <- democracy |>
group_by(country, ctyname) |>
summarise(T_i = sum(!is.na(ln_gdpw)), .groups = "drop")
range(panel_T$T_i)
## [1] 6 40
ggplot(panel_T, aes(T_i)) +
geom_histogram(binwidth = 2, fill = "steelblue", alpha = 0.7,
boundary = 0) +
labs(x = "Usable observations per country",
y = "# countries") +
theme_minimal(base_size = 11)
Distribution of usable time periods per country. The panel contains 40 unique years, but many countries have shorter histories.
For the simplified specification, we will use four variables:
ln_gdpw is the outcome: log real GDP
per worker.reg is the treatment: 1
for democracy and 0 for dictatorship in this file’s
coding.edt is the slow-moving covariate
and the moderator: cumulative average years of labor-force
education. The interaction reg × edt lets the regime effect
vary with the workforce’s education level.oil is a time-invariant control:
1 if the country was a major oil exporter (per the
Przeworski codebook), 0 otherwise.A quick sanity check that oil, britcol,
elf60 really are unit-invariant in the data, and
reg/edt really do move within units:
pdata <- pdata.frame(democracy, index = c("country_f", "year_f"))
pvar(pdata)
## no time variation: country ctyname region britcol cath elf60 moslem newc oil country_f
## no individual variation: year edt elf60 year_f
## all NA in time dimension for at least one individual: civlib edt elf60 pollib
## all NA in ind. dimension for at least one time period: civlib edt elf60 pollib
The pvar() summary confirms what the DAG drew: the
time-invariants vary only between, and reg/edt
vary both ways.
We need to settle the dynamics before fitting any FE/RE/Mundlak model — otherwise we are just running OLS on a non-stationary outcome and inviting spurious results.
ggplot(democracy, aes(year, ln_gdpw, group = country, colour = region)) +
geom_line(alpha = 0.5) +
scale_colour_brewer(palette = "Dark2") +
labs(x = NULL, y = "log GDP per worker", colour = NULL) +
theme_minimal(base_size = 11) +
theme(legend.position = "bottom")
Per-country trajectories of log GDP per worker, coloured by region. Most series trend upward with country-specific intercepts and slopes — non-stationary in levels.
Most trajectories trend up; some are flat or volatile. This is the classic mix you will see in macro-panels and a strong prior toward unit roots in levels.
The level series are visibly trended, so raw ACF/PACF plots mostly rediscover the trend. A cleaner diagnostic is to remove a country-specific linear trend first, then inspect the residual serial dependence.
The loop below does two things. First, it saves the de-trended series
in democracy_detrended. Second, it exports a small
diagnostic archive:
output/dynamics/detrended_series/
output/dynamics/acf/
output/dynamics/pacf/
# create subdirectories (if these are not already there)
dir.create("output/dynamics/detrended_series",
recursive = TRUE, showWarnings = FALSE)
dir.create("output/dynamics/acf",
recursive = TRUE, showWarnings = FALSE)
dir.create("output/dynamics/pacf",
recursive = TRUE, showWarnings = FALSE)
# generate vector of unique countries
countries <- sort(unique(democracy$ctyname))
max_lag <- 8 # number of lags for cycles
# generate objects to store
democracy_detrended <- tibble()
acf_all <- tibble()
pacf_all <- tibble()
for (c0 in countries) {
# 1. filter country candidate
df_c <- democracy |>
filter(ctyname == c0) |>
arrange(year)
# 2. Detrend
trend_fit <- lm(ln_gdpw ~ year, data = df_c, na.action = na.exclude)
df_c$ln_gdpw_trend <- fitted(trend_fit)
df_c$ln_gdpw_detrended <- residuals(trend_fit)
democracy_detrended <- bind_rows(democracy_detrended, df_c)
# remove NAs in the trend (if any)
df_c <- df_c |>
filter(!is.na(ln_gdpw_detrended))
# clean the strings, to name the files
file_stub <- c0 |>
str_to_lower() |>
str_replace_all("[^a-z0-9]+", "_") |>
str_replace_all("^_|_$", "")
# 3. plot the series
p_series <-
ggplot(df_c, aes(year, ln_gdpw_detrended)) +
geom_line(colour = "grey30") +
geom_hline(yintercept = 0, linetype = "dashed") +
labs(title = c0,
x = NULL,
y = "De-trended log GDP per worker") +
theme_minimal(base_size = 11)
# export for inspection
ggsave(file.path("output/dynamics/detrended_series",
paste0(file_stub, ".pdf")),
plot = p_series, width = 7, height = 4)
# 4. plot ACF
p_acf <- ggAcf(df_c$ln_gdpw_detrended, lag.max = max_lag) +
labs(title = paste0(c0, ": ACF after de-trending"),
x = "Lag", y = "ACF") +
theme_minimal(base_size = 11)
# export for inspection
ggsave(file.path("output/dynamics/acf",
paste0(file_stub, ".pdf")),
plot = p_acf, width = 7, height = 4)
# 5. plot PACF
p_pacf <- ggPacf(df_c$ln_gdpw_detrended, lag.max = max_lag) +
labs(title = paste0(c0, ": PACF after de-trending"),
x = "Lag", y = "PACF") +
theme_minimal(base_size = 11)
# export for inspection
ggsave(file.path("output/dynamics/pacf",
paste0(file_stub, ".pdf")),
plot = p_pacf, width = 7, height = 4)
# store all acf and pacf values
acf_values <- acf(df_c$ln_gdpw_detrended,
lag.max = max_lag, plot = FALSE)$acf[-1]
pacf_values <- pacf(df_c$ln_gdpw_detrended,
lag.max = max_lag, plot = FALSE)$acf
acf_all <- bind_rows(
acf_all,
tibble(ctyname = c0,
lag = seq_along(acf_values),
value = as.numeric(acf_values))
)
pacf_all <- bind_rows(
pacf_all,
tibble(ctyname = c0,
lag = seq_along(pacf_values),
value = as.numeric(pacf_values))
)
}
# You can look at particular subsets
democracy_detrended |>
filter(ctyname %in% c("USA", "Argentina", "Mexico", "Egypt")) |>
ggplot(aes(year, ln_gdpw_detrended)) +
geom_line(colour = "grey30") +
geom_hline(yintercept = 0, linetype = "dashed") +
facet_wrap(~ ctyname, scales = "free_y") +
labs(x = NULL, y = "De-trended log GDP per worker") +
theme_minimal(base_size = 11)
Illustrative de-trended log GDP per worker series. Each line is the residual from a country-specific linear trend.
The exported files are not meant to be read one by one during lab. They are a reproducible diagnostic archive. In applied work, the pooled summary is useful, but individual units still need inspection because a few unusual panels can drive the average.
We can summarize the ACF/PACF diagnostics across countries by averaging the country-level correlations at each lag. This is descriptive rather than a formal estimator, but it gives a compact view of the typical short-run dynamics after de-trending.
To help read the figure, we add the usual approximate white-noise bounds:
\[ \pm \frac{1.96}{\sqrt{T}}. \]
The logic is that, if a series is white noise, the sample autocorrelation at any fixed lag is approximately normally distributed around zero with standard error \(1/\sqrt{T}\). Multiplying by 1.96 gives the familiar approximate 95% threshold. Here we use the average usable country-level time-series length after de-trending, so the bounds are a visual rule of thumb rather than a formal pooled significance test.
acf_summary <- acf_all |>
group_by(lag) |>
summarise(mean = mean(value, na.rm = TRUE),
lo = quantile(value, 0.25, na.rm = TRUE),
hi = quantile(value, 0.75, na.rm = TRUE),
.groups = "drop")
pacf_summary <- pacf_all |>
group_by(lag) |>
summarise(mean = mean(value, na.rm = TRUE),
lo = quantile(value, 0.25, na.rm = TRUE),
hi = quantile(value, 0.75, na.rm = TRUE),
.groups = "drop")
avg_T <- democracy_detrended |>
filter(!is.na(ln_gdpw_detrended)) |>
count(ctyname) |>
summarise(avg_T = mean(n)) |>
pull(avg_T)
wn_bound <- 1.96 / sqrt(avg_T)
p_acf_pool <- ggplot(acf_summary, aes(lag, mean)) +
geom_hline(yintercept = 0, colour = "grey70") +
geom_hline(yintercept = c(-wn_bound, wn_bound),
linetype = "dashed", colour = "grey45") +
geom_linerange(aes(ymin = lo, ymax = hi), colour = "grey55") +
geom_point(colour = "steelblue", size = 2) +
geom_line(colour = "steelblue") +
scale_x_continuous(breaks = 1:max_lag) +
labs(title = "Average ACF",
subtitle = "Dashed lines: approximate white-noise bounds",
x = "Lag", y = "Correlation") +
theme_minimal(base_size = 11)
p_pacf_pool <- ggplot(pacf_summary, aes(lag, mean)) +
geom_hline(yintercept = 0, colour = "grey70") +
geom_hline(yintercept = c(-wn_bound, wn_bound),
linetype = "dashed", colour = "grey45") +
geom_linerange(aes(ymin = lo, ymax = hi), colour = "grey55") +
geom_point(colour = "firebrick", size = 2) +
geom_line(colour = "firebrick") +
scale_x_continuous(breaks = 1:max_lag) +
labs(title = "Average PACF",
subtitle = "Dashed lines: approximate white-noise bounds",
x = "Lag", y = "Partial correlation") +
theme_minimal(base_size = 11)
p_acf_pool + p_pacf_pool
Average ACF and PACF after removing country-specific linear trends. Points show the mean across countries; vertical bars show the interquartile range.
Finally, we loop over countries and run ADF / PP tests on the
de-trended series. In the tseries implementations we use
here, both tests have the same null:
| Test | Null hypothesis | Default alternative | Reading a large p-value |
|---|---|---|---|
ADF (adf.test) |
unit root / I(1) | stationary / I(0) | fail to reject unit root |
PP (pp.test) |
unit root / I(1) | stationary / I(0) | fail to reject unit root |
The difference is not the null hypothesis. The difference is how the tests handle serial correlation and heteroskedasticity in the errors: ADF adds lagged differences, while PP uses a nonparametric correction to the Dickey-Fuller statistic. Both are power-sensitive. With short country series, a highly persistent but stationary process can easily look too close to a unit root, so failure to reject should be read as no strong evidence against a unit root, not proof that the unit root is true.
The setup below is still cleaner than testing visibly trended levels without first removing the deterministic trend.
# generate object to store test p-values
unit_root_tests <- tibble()
for (c0 in countries) {
# 1. subset each country time series
y <- democracy_detrended |>
filter(ctyname == c0) |>
arrange(year) |>
pull(ln_gdpw_detrended) |>
na.omit()
# 2. test ADF and PP
adf_p <- adf.test(y)$p.value
pp_p <- pp.test(y)$p.value
# 3. save results
unit_root_tests <- bind_rows(
unit_root_tests,
tibble(ctyname = c0,
adf = adf_p,
pp = pp_p)
)
}
# Visualize in a histogram the distribution of -pvalues
unit_root_tests |>
pivot_longer(c(adf, pp),
names_to = "test",
values_to = "p") |>
ggplot(aes(p, fill = test)) +
geom_histogram(bins = 25,
alpha = 0.65,
position = "identity") +
geom_vline(xintercept = 0.05, linetype = "dashed") +
facet_wrap(~ test, ncol = 2) +
labs(title = "De-trended log GDP per worker",
x = "p-value", y = "# countries") +
theme_minimal(base_size = 10) +
theme(legend.position = "none")
ADF and PP p-values after country-specific linear de-trending.
Most countries still have ADF / PP p-values above 0.05 after removing country-specific linear trends. Because both tests have a unit-root null, this means most country series do not provide strong evidence against unit-root-like persistence. Given the short time dimension, the result is also consistent with near-unit-root stationary dynamics.
The de-trended ACF/PACF plots in §1.3.2 and the per-series ADF/PP scan in §1.3.3 are descriptive: each country’s deterministic trend was removed by OLS before the test, so the residuals were clean by construction. They tell us about residual short-run dependence — not whether the level series is itself I(1) or trend-stationary.
The formal panel tests below answer that harder question: are the
level series better described as a random walk (with drift), or as
orbiting a deterministic trend? We feed purtest the
level series of ln_gdpw and let it handle the trend
internally (exo = "trend").
The two competing models. With
exo = "trend", the test pits these two models against each
other:
\(H_0\) — unit root with drift: increments are stationary with mean \(\delta_i\), but the level wanders indefinitely. \[y_{it} = y_{i,t-1} + \delta_i + \varepsilon_{it} \quad\Longleftrightarrow\quad \Delta y_{it} = \delta_i + \varepsilon_{it}\]
\(H_1\) — trend-stationary: the level orbits around a country-specific linear trend; deviations are mean-reverting. \[y_{it} = \mu_i + \delta_i\,t + u_{it}, \qquad u_{it}\text{ stationary I(0)}\]
In \(H_0\), \(\delta_i\) is the per-period drift (the average step the random walk takes each year). In \(H_1\), \(\mu_i\) is the country intercept and \(\delta_i\) is the slope of the deterministic linear trend. The two models can both produce upward-sloping trajectories — the difference is whether shocks fade away (\(H_1\)) or accumulate forever (\(H_0\)). This is the same TS-vs-I(1)-with-drift distinction we drew in Lab 4 §1.2: removing a deterministic trend by OLS does not adjudicate between the two; the unit-root test with trend does.
Why we read a battery of tests, not just one. The panel is unbalanced, macro shocks (oil crises, financial cycles, the Volcker disinflation) hit many countries in the same year, and individual unit-root tests have low power at \(T_i \le 30\). No single test is decisive — we run four tests with different strengths and read them together. Two recurring ideas to keep in mind:
The four tests, in plain language:
| Step | Test | Null hypothesis | Plain-language description |
|---|---|---|---|
| (a) | Pesaran CD pretest | Cross-sectional independence | Looks at residuals across countries to see whether they move together. Rejection means common shocks contaminate the residuals — the tests in (b) become biased and we lean more on (d). |
| (b) | Maddala-Wu/Fisher & IPS | Unit root in every panel | Run an ADF on each country, then combine. Fisher pools unit-level \(p\)-values via \(-2\sum\log p_i\); IPS averages unit-level \(t\)-statistics. Both assume countries are independent — work well when (a) does not reject. |
| (c) | Hadri | Stationarity in every panel | Flips the null. KPSS-style: tests whether all panels are stationary against the alternative of any unit root. Needs a balanced rectangle. Hlouskova & Wagner (2006) document that Hadri over-rejects under CSD — its rejection deserves less weight when (a) flags dependence. |
| (d) | Pesaran CIPS | Unit root, allowing common shocks | A unit-root test built to be robust to the CSD detected in (a). Augments each country’s ADF with cross-sectional means of \(y\) and \(\Delta y\) — those means absorb the common factor before the country-level test runs. |
We keep countries with at least 20 usable observations for tests (a), (b), (d). For Hadri (c) we need a balanced rectangle — every retained country observed in every year of a fixed window — so we take the last 20 years of the panel and keep countries with full coverage there.
# ---- (i) Build the two analysis panels ---------------------------------------
# Unbalanced panel for steps (a), (b), (d): all countries with T_i >= 20.
db_levels <- democracy |>
filter(!is.na(ln_gdpw)) |>
group_by(country) |>
filter(n() >= 20, sd(ln_gdpw) > 0) |>
ungroup()
# Balanced rectangle for Hadri (step c): last 20 years of the panel,
# keeping only countries observed in every one of those years.
T_target <- 20
window_start <- max(democracy$year) - T_target + 1
window_end <- max(democracy$year)
db_levels_balanced <- democracy |>
filter(year >= window_start, year <= window_end, !is.na(ln_gdpw)) |>
group_by(country) |>
filter(n() == T_target, sd(ln_gdpw) > 0) |>
ungroup()
pdata_levels <- pdata.frame(db_levels, index = c("country", "year"))
# Quote eligibility so students see how much of the panel is being tested.
sprintf("Unbalanced panel for (a)/(b)/(d): %d countries, %d country-years.",
n_distinct(db_levels$country), nrow(db_levels))
## [1] "Unbalanced panel for (a)/(b)/(d): 116 countries, 3897 country-years."
sprintf("Balanced subset for Hadri (c): %d countries x %d years (%d-%d).",
n_distinct(db_levels_balanced$country),
T_target, window_start, window_end)
## [1] "Balanced subset for Hadri (c): 99 countries x 20 years (1971-1990)."
# ---- (a) Pesaran CD: is there cross-sectional dependence? --------------------
# Tests CSD in residuals of a within model that absorbs country FE and a global
# linear time trend. Rejection says common shocks beyond the global trend
# contaminate the residuals — first-gen tests are then biased. (`pcdtest` with
# a within model needs at least one regressor; `year` plays that role.)
cd_levels <- pcdtest(ln_gdpw ~ year, data = pdata_levels,
model = "within", test = "cd")
# ---- (b) First-generation tests: H0 = unit root in all panels ----------------
# `exo = "trend"` includes constant + linear trend in each unit's
# Dickey-Fuller regression, so H0 = I(1)-with-drift, H1 = trend-stationary.
# Lag length picked by SIC up to pmax = 4.
mw_levels <- purtest(ln_gdpw ~ trend,
data = db_levels,
index = c("country", "year"),
test = "madwu",
lags = "SIC", pmax = 4,
exo = "trend")
ips_levels <- purtest(ln_gdpw ~ trend,
data = db_levels,
index = c("country", "year"),
test = "ips",
lags = "SIC", pmax = 4,
exo = "trend")
# ---- (c) Hadri: H0 = stationarity in all panels (KPSS-style) -----------------
# Hadri uses a long-run-variance correction (not ADF lags), so we leave the
# bandwidth at its default and just signal `exo = "trend"` for detrending.
hadri_levels <- purtest(ln_gdpw ~ trend,
data = db_levels_balanced,
index = c("country", "year"),
test = "hadri",
exo = "trend")
# ---- (d) Pesaran CIPS: second-generation, robust to CSD ----------------------
# Augments each unit's ADF regression with cross-sectional means of y and
# Delta y to absorb the common factor. `type = "trend"` includes a trend.
cips_levels <- cipstest(pdata_levels$ln_gdpw,
lags = 2,
type = "trend")
# ---- Tabulate the constellation. The `direction` column makes explicit ------
# what a rejection means: rejecting H0 in (b)/(d) favors I(0); rejecting H0
# in (c) — Hadri — favors I(1). High p-values mean the opposite.
panel_tests <- tibble(
step = c("(a)", "(b)", "(b)", "(c)", "(d)"),
test = c("Pesaran CD", "Maddala-Wu/Fisher", "IPS", "Hadri", "Pesaran CIPS"),
null = c("Cross-sectional independence",
"Unit root in all panels",
"Unit root in all panels",
"Stationarity in all panels",
"Unit root with common-factor correction"),
direction = c("CSD pretest",
"Reject => favors I(0)",
"Reject => favors I(0)",
"Reject => favors I(1)",
"Reject => favors I(0)"),
p_value = c(cd_levels$p.value,
mw_levels$statistic$p.value,
ips_levels$statistic$p.value,
hadri_levels$statistic$p.value,
cips_levels$p.value)
) |>
mutate(decision = if_else(p_value < 0.05, "Reject null", "Do not reject null"))
panel_tests |>
mutate(p_value = scales::pvalue(p_value)) |>
kable(caption = "Panel diagnostic constellation for log GDP per worker in levels.")
| step | test | null | direction | p_value | decision |
|---|---|---|---|---|---|
| (a) | Pesaran CD | Cross-sectional independence | CSD pretest | <0.001 | Reject null |
| (b) | Maddala-Wu/Fisher | Unit root in all panels | Reject => favors I(0) | 0.040 | Reject null |
| (b) | IPS | Unit root in all panels | Reject => favors I(0) | >0.999 | Do not reject null |
| (c) | Hadri | Stationarity in all panels | Reject => favors I(1) | <0.001 | Reject null |
| (d) | Pesaran CIPS | Unit root with common-factor correction | Reject => favors I(0) | 0.100 | Do not reject null |
Reading the four tests together. Step (a) tells us whether the standard panel tests in (b) are reliable: if it rejects, common shocks are contaminating the residuals and we should weight (d) — designed for this case — more heavily.
Within step (b), Maddala-Wu/Fisher and IPS combine the same per-country ADFs in different ways. A split between them is informative, not a contradiction: Fisher (Choi 2001) sums \(-2\log p_i\), so a few countries with very small \(p\)-values can drag the statistic across the threshold; IPS averages normalized \(t\)-statistics, which is more robust to outliers. So when Fisher rejects but IPS does not, the rejection comes from a small number of fast-mean-reverting countries (small open economies, post-shock recoveries) — not from panel-wide stationarity.
Hadri (c) flips the null toward stationarity, and under CSD it tends to over-reject — so its rejection here gets discounted.
The test designed for our regime — CIPS (d), robust to the CSD found in (a) — fails to reject the unit-root null. Combining everything: the honest diagnosis is strong persistence consistent with I(1)-type behavior in levels, with the standard caveat that no test fully discriminates near-unit-root stationarity from a true unit root at \(T \le 30\).
If we chose to treat the outcome as difference-stationary, the natural diagnostic is to repeat §1.3.3’s per-country ADF/PP scan, but on the first-differenced series. This is not the path we take below, but it makes the tradeoff explicit: differencing should remove most of the persistence, at the cost of changing the estimand from level effects to growth-rate effects.
# Per-country ADF and PP on the first-differenced series.
# Mirrors the §1.3.3 scan, but on Δ ln_gdpw instead of detrended ln_gdpw.
fd_unit_root_tests <- tibble()
for (c0 in countries) {
# 1. compute first differences for this country
y_diff <- democracy |>
filter(ctyname == c0) |>
arrange(year) |>
mutate(d_ln_gdpw = ln_gdpw - lag(ln_gdpw)) |>
pull(d_ln_gdpw) |>
na.omit()
if (length(y_diff) < 8) next # skip very short series
# 2. run ADF and PP on the differences
adf_p <- adf.test(y_diff)$p.value
pp_p <- pp.test(y_diff)$p.value
# 3. save results
fd_unit_root_tests <- bind_rows(
fd_unit_root_tests,
tibble(ctyname = c0, adf = adf_p, pp = pp_p)
)
}
fd_unit_root_tests |>
pivot_longer(c(adf, pp), names_to = "test", values_to = "p") |>
ggplot(aes(p, fill = test)) +
geom_histogram(bins = 25, alpha = 0.65, position = "identity") +
geom_vline(xintercept = 0.05, linetype = "dashed") +
facet_wrap(~ test, ncol = 2) +
labs(title = "First-differenced log GDP per worker",
subtitle = "Per-country ADF and PP p-values; dashed line marks p = 0.05",
x = "p-value", y = "# countries") +
theme_minimal(base_size = 10) +
theme(legend.position = "none")
ADF and PP p-values on first-differenced log GDP per worker, computed country by country.
Compared with §1.3.3, the bulk of the country-level \(p\)-values now sits below 0.05:
differencing flips most countries from “cannot reject unit root” to
“rejects unit root in favor of stationarity.” That is the picture we
expected if ln_gdpw is I(1).
The pooled Maddala-Wu/Fisher confirms the picture at the panel level:
db_diff <- db_levels |>
group_by(country) |>
mutate(d_ln_gdpw = ln_gdpw - lag(ln_gdpw)) |>
ungroup() |>
drop_na(d_ln_gdpw)
mw_diff <- purtest(d_ln_gdpw ~ 1,
data = db_diff,
index = c("country", "year"),
test = "madwu",
lags = "SIC",
pmax = 4,
exo = "intercept")
mw_diff_unit_p <- imap_dfr(
mw_diff$idres,
~ tibble(country = as.numeric(.y),
p = as.numeric(.x$p.trho),
lags = .x$lags,
T = .x$T)
)
mw_diff_readout <- tibble(
test = "Maddala-Wu / Fisher",
statistic = as.numeric(mw_diff$statistic$statistic),
df = as.numeric(mw_diff$statistic$parameter),
p_value = as.numeric(mw_diff$statistic$p.value)
)
mw_diff_readout |>
mutate(p_value = scales::pvalue(p_value)) |>
kable(caption = "Path B diagnostic: Maddala-Wu/Fisher test on first-differenced log GDP per worker.")
| test | statistic | df | p_value |
|---|---|---|---|
| Maddala-Wu / Fisher | 2548.104 | 232 | <0.001 |
ggplot(mw_diff_unit_p, aes(p)) +
geom_histogram(aes(y = after_stat(density)),
bins = 25,
fill = "steelblue",
alpha = 0.65,
colour = "white") +
geom_density(colour = "firebrick", linewidth = 0.8) +
geom_vline(xintercept = 0.05, linetype = "dashed") +
labs(title = "Path B diagnostic: unit-level p-values after differencing",
subtitle = "First-differenced log GDP per worker; dashed line marks p = 0.05",
x = "Unit-level ADF p-value", y = "Density") +
theme_minimal(base_size = 11)
The first-difference diagnostic is much cleaner. That is exactly why differencing is tempting. The cost is substantive: the outcome becomes growth in log GDP per worker, not the level of log GDP per worker.
ar1_one <- function(y) {
y <- na.omit(y)
if (length(y) < 10) return(NA_real_)
unname(coef(arima(y, order = c(1, 0, 0)))[1])
}
phi_diff <- democracy |>
group_by(country) |>
arrange(year) |>
summarise(phi = ar1_one(ln_gdpw - lag(ln_gdpw)), .groups = "drop")
ggplot(phi_diff, aes(phi)) +
geom_histogram(bins = 25, fill = "steelblue", alpha = 0.7) +
geom_vline(xintercept = mean(phi_diff$phi, na.rm = TRUE),
colour = "firebrick", linetype = "dashed") +
labs(x = expression(hat(phi)[i]), y = "# countries",
subtitle = "Mean across countries marked in red") +
theme_minimal(base_size = 11)
Estimated AR(1) coefficients on first-differenced log GDP per worker, fit one country at a time. A descriptive summary of short-run persistence after differencing — read as the AR(1) part of an ARIMA(1,1,0) only if we accept the I(1) verdict from §1.3.4.
Two specifications carry the dynamics:
| Path | Outcome | Dynamics absorbed by | Trade-offs |
|---|---|---|---|
| A | \(\ln\)-level | Lagged DV in levels (ARDL(1)) | Coefficients are interpretable as level effects; Beck
& Katz (2011) recommend this as the TSCS default;
simcf::ldvsimev propagates it natively. Nickell
bias in dynamic FE for short \(T\) (Lab 6 will fix). |
| B | \(\Delta\ln\) | Lagged differences | Outcome becomes a growth rate; Mundlak’s within/between split is awkward on differences; forecasts must be un-differenced. |
We take Path A for the rest of the lab. This is a defensible modeling choice, not a triumphant test verdict — and that is exactly why Part 2 below estimates a family of specifications (pooled, FE, RE, Mundlak/REWB) and reads whether the substantive conclusion is robust across them, rather than picking one estimator and defending it. We revisit the Nickell-bias correction next week with Arellano–Bond and dynamic panel GMM.
§1.3.4 closed in honest ambiguity: at \(T
\le 30\) with cross-sectional dependence, the data cannot tell us
with confidence whether ln_gdpw is I(1) or near-unit-root
stationary. The pedagogical response is not to pick one
estimator and defend it, but to estimate a family of
specifications that identify the regime effect under different
assumptions, and read whether the conclusion is robust across
them.
Each estimator below is defined by which variance it uses. Following Kropko & Kubinec (2020) and Bell & Jones (2015), every time-varying regressor decomposes into a within and a between component:
\[ x_{it} \;=\; \underbrace{(x_{it} - \bar x_i)}_{x^{W}_{it}\ (\text{within})} \;+\; \underbrace{\bar x_i}_{x^{B}_i\ (\text{between})} \]
Different estimators recover different combinations of \(x^W\) and \(x^B\):
| Specification | What \(\hat\beta\) identifies | Identifying assumption |
|---|---|---|
| Pooled OLS | mixture of within + between | \(u_{it}\perp X_{it}\); common intercept \(\alpha\) for all units |
| Fixed effects (within) | within only (\(\beta_W\)) | \(\varepsilon_{it}\perp x_{it}\) given \(\alpha_i\); allows \(\text{Cov}(\alpha_i, X)\neq 0\) |
| Two-way fixed effects | deviations from country and year means | removes country levels and common year shocks, but can be hard to interpret outside a clear design |
| Random effects | matrix-weighted average of \(\beta_W\) and \(\beta_B\) | \(\alpha_i\perp X_i\) — country intercepts uncorrelated with regressors |
| Mundlak / REWB | \(\beta_W\) and \(\beta_B\) separately | RE on residuals; \(\bar x_i\) absorbs cross-level confounding |
The right specification depends on the research question:
Two pedagogical reminders we will use in §2.7:
The five candidates all start from the same equation:
\[ y_{it} \;=\; \alpha + \beta_1\,\text{reg}_{it} + \beta_2\,\text{edt}_{it} + \beta_3\,(\text{reg}_{it}\!\cdot\!\text{edt}_{it}) + \beta_4\,\text{oil}_i + u_{it} \]
§1.3.7 chose Path A: ARDL(1) in levels — a lagged DV \(y_{i,t-1}\) to absorb the I(1)-like persistence. For pedagogical clarity, each subsection below fits the static version and the dynamic version(s) side-by-side, so you can see directly what dynamics buy you. The full menu by subsection:
| Subsection | Static | AR(1) (lagged DV) | ARDL(1,1) (+ lagged reg) |
corARMA(1) (AR(1) errors) |
|---|---|---|---|---|
| 2.1 Pooled OLS | ✓ | ✓ | ✓ | — |
| 2.2 FE (within) | ✓ | ✓ | ✓ | — |
| 2.3 TWFE | ✓ | ✓ | ✓ | — |
| 2.4 RE | ✓ | — | — | ✓ |
| 2.5 Mundlak / REWB | ✓ | — | — | ✓ |
Label note. “AR(1)” in this lab means an AR(1) on the outcome — i.e., a lagged dependent variable in the regression. corARMA(1) is an AR(1) on the errors. The two are observationally close but interpret differently — the §2.4 RE block walks through both side by side.
Two methodological notes built into this menu:
nlme::lme.
feols and plm cannot fit it; cluster-robust /
Driscoll–Kraay SEs play the same role for those engines.# Build the dynamic data once for all of Part 2 (and Part 3):
# ln_gdpw_lag — lagged outcome (for AR(1) and ARDL specs in levels)
# reg_lag — lagged treatment (for ARDL(1,1))
# edt_lag — lagged moderator (so the lagged interaction reg_lag × edt_lag
# is the t-1 version of the contemporaneous reg × edt)
# d_ln_gdpw — first difference of the outcome (for §2.1 FD specs)
# d_ln_gdpw_lag — lag of the differenced outcome (for FD-AR(1) and FD-ARDL(1,1))
democracy_dyn <- democracy |>
group_by(country) |>
arrange(year) |>
mutate(ln_gdpw_lag = lag(ln_gdpw),
reg_lag = lag(reg),
edt_lag = lag(edt),
d_ln_gdpw = ln_gdpw - ln_gdpw_lag,
d_ln_gdpw_lag = lag(d_ln_gdpw)) |>
ungroup()
We start with pooled OLS — every country-year exchangeable, one common intercept. Pooled OLS cannot recover the within or between effect cleanly, but it gives a useful baseline and a place to confront the nonstationarity flagged by §1.3.4 directly. The level AR(1) here will show \(\hat\phi\) close to 1, the unmistakable signature of an I(1) outcome; differencing the outcome restores stationarity.
Four specifications.
| Col | Spec | Outcome | Right-hand side |
|---|---|---|---|
| (1) | Static | \(y_{it}\) | \(\beta X_{it}\) |
| (2) | AR(1) on level | \(y_{it}\) | \(\phi\,y_{i,t-1} + \beta X_{it}\) |
| (3) | FD + AR(1) | \(\Delta y_{it}\) | \(\phi\,\Delta y_{i,t-1} + \beta X_{it}\) |
| (4) | FD + ARDL(1,1) | \(\Delta y_{it}\) | \(\phi\,\Delta y_{i,t-1} + \beta_0\,\text{reg}_{it} + \beta_1\,\text{reg}_{i,t-1} + \gamma_0\,\text{reg}_{it}\!\cdot\!\text{edt}_{it} + \gamma_1\,\text{reg}_{i,t-1}\!\cdot\!\text{edt}_{i,t-1} + \dots\) |
Logic. A level AR(1) with \(\hat\phi \approx 1\) is the I(1) signature: the lagged DV absorbs essentially all variance in \(y_{it}\) and the implied long-run multiplier \(\hat\beta/(1-\hat\phi)\) is unstable. Differencing the outcome flips the model to a growth-rate equation — a stationary quantity — and the lagged \(\Delta y_{i,t-1}\) then carries growth-rate dynamics, which mean-revert quickly. The ARDL(1,1) version of FD additionally lets the regime effect (and its moderation by education) spread over a one-year window.
# Drop rows with missing variables once, so all four fits use the same N.
db_pool <- democracy_dyn |>
drop_na(ln_gdpw, ln_gdpw_lag, d_ln_gdpw, d_ln_gdpw_lag,
reg, reg_lag, edt, edt_lag, oil)
# (1) Static pooled OLS in levels: no dynamics.
fit_pool <- lm(ln_gdpw ~ reg * edt + oil,
data = db_pool)
# (2) AR(1) on level: add the lagged outcome. Expect phi near 1 (I(1) signature).
fit_pool_dyn <- lm(ln_gdpw ~ ln_gdpw_lag + reg * edt + oil,
data = db_pool)
# (3) FD + AR(1): outcome is Delta y; add lagged Delta y to absorb growth-rate dynamics.
fit_pool_fd_dyn <- lm(d_ln_gdpw ~ d_ln_gdpw_lag + reg * edt + oil,
data = db_pool)
# (4) FD + ARDL(1,1): also lag the treatment and its interaction with edt.
fit_pool_fd_ardl11 <- lm(d_ln_gdpw ~ d_ln_gdpw_lag +
reg + reg_lag +
edt + edt_lag +
reg:edt + reg_lag:edt_lag +
oil,
data = db_pool)
show_reg(
list(fit_pool, fit_pool_dyn, fit_pool_fd_dyn, fit_pool_fd_ardl11),
custom.model.names = c("(1) Static", "(2) AR(1) level",
"(3) FD + AR(1)", "(4) FD + ARDL(1,1)"),
custom.coef.map = list(
"(Intercept)" = "(Intercept)",
"ln_gdpw_lag" = "y_lag (φ, level)",
"d_ln_gdpw_lag" = "Δy_lag (φ, FD)",
"reg" = "reg",
"reg_lag" = "reg_lag",
"edt" = "edt",
"edt_lag" = "edt_lag",
"reg:edt" = "reg × edt",
"reg_lag:edt_lag" = "reg_lag × edt_lag",
"oil" = "oil"
),
digits = 3
)
|
|
|
|
|
|---|---|---|---|---|
| (Intercept) | 7.277*** | 0.031* | 0.016*** | 0.016*** |
| (0.028) | (0.014) | (0.003) | (0.003) | |
| y_lag (φ, level) | 0.998*** | |||
| (0.002) | ||||
| Δy_lag (φ, FD) | 0.151*** | 0.151*** | ||
| (0.019) | (0.019) | |||
| reg | 0.633*** | 0.005 | 0.003 | 0.012 |
| (0.059) | (0.006) | (0.006) | (0.019) | |
| reg_lag | -0.010 | |||
| (0.019) | ||||
| edt | 0.233*** | 0.001 | 0.000 | 0.002 |
| (0.006) | (0.001) | (0.001) | (0.007) | |
| edt_lag | -0.002 | |||
| (0.007) | ||||
| reg × edt | -0.027** | -0.001 | -0.000 | -0.001 |
| (0.009) | (0.001) | (0.001) | (0.004) | |
| reg_lag × edt_lag | 0.001 | |||
| (0.004) | ||||
| oil | 0.752*** | -0.000 | -0.002 | -0.002 |
| (0.037) | (0.004) | (0.004) | (0.004) | |
| R2 | 0.656 | 0.996 | 0.023 | 0.023 |
| Adj. R2 | 0.656 | 0.996 | 0.021 | 0.020 |
| Num. obs. | 2734 | 2734 | 2734 | 2734 |
| ***p < 0.001; **p < 0.01; *p < 0.05 | ||||
Reading.
reg × edt
and the main effects on reg and edt are
identified mostly off cross-country variation.ln_gdpw_lag is
close to 1 — the lagged DV absorbs essentially all
variance in \(y_{it}\), contemporaneous
coefficients shrink toward zero, and \(\hat\beta/(1-\hat\phi)\) is unstable.
This is the nonstationarity signature §1.3.4 warned about.reg and lagged reg × edt terms show whether
democratization’s growth-rate effect — and its moderation by education —
spreads over time.The cleanest specification is column (4): FD + ARDL(1,1). We pick this as the best pooled model going forward.
We check for residual autocorrelation in the FD + ARDL(1,1) pooled
model with a panel Breusch-Godfrey test (plm::pbgtest),
which respects the country structure of the residuals.
# Refit as a plm pooling model so pbgtest and the panel SE corrections work.
fit_pool_fd_ardl11_plm <- plm(
d_ln_gdpw ~ d_ln_gdpw_lag +
reg + reg_lag +
edt + edt_lag +
reg:edt + reg_lag:edt_lag +
oil,
data = db_pool,
index = c("country_f", "year_f"),
model = "pooling"
)
pbgtest(fit_pool_fd_ardl11_plm, order = 2)
##
## Breusch-Godfrey/Wooldridge test for serial correlation in panel models
##
## data: d_ln_gdpw ~ d_ln_gdpw_lag + reg + reg_lag + edt + edt_lag + reg:edt + ...
## chisq = 7.4933, df = 2, p-value = 0.0236
## alternative hypothesis: serial correlation in idiosyncratic errors
If the test rejects, residuals still carry serial correlation after the ARDL(1,1) parameterization — common in macro-panels because the dynamics in (4) are a one-equation approximation, not the full DGP. The fix is to leave the point estimates alone and adjust the standard errors.
OLS point estimates are unbiased even with autocorrelated and
cross-sectionally dependent errors; the SEs are not. Three options for
the panel SE, all available via plm:
ses_pool_fd <- list(
cluster = vcovHC(fit_pool_fd_ardl11_plm, method = "arellano"),
pcse = vcovBK(fit_pool_fd_ardl11_plm),
driscoll = vcovSCC(fit_pool_fd_ardl11_plm)
)
target_pool <- c("d_ln_gdpw_lag", "reg", "reg_lag",
"reg:edt", "reg_lag:edt_lag")
ses_pool_tab <- map(ses_pool_fd, function(V)
coeftest(fit_pool_fd_ardl11_plm, vcov. = V)[target_pool, 2])
bind_cols(coefficient = target_pool, as_tibble(ses_pool_tab)) |>
kable(digits = 4,
caption = "SE corrections on the FD + ARDL(1,1) pooled model.") |>
kable_styling(full_width = FALSE)
| coefficient | cluster | pcse | driscoll |
|---|---|---|---|
| d_ln_gdpw_lag | 0.0356 | 0.0198 | 0.0411 |
| reg | 0.0210 | 0.0183 | 0.0236 |
| reg_lag | 0.0206 | 0.0177 | 0.0243 |
| reg:edt | 0.0040 | 0.0039 | 0.0043 |
| reg_lag:edt_lag | 0.0040 | 0.0039 | 0.0045 |
We adopt PCSE (Beck & Katz 1995) as the default for pooled OLS, with two reasons:
If the §2.1.1 BG test rejects strongly, we switch to Driscoll-Kraay instead — it layers serial correlation on top of cross-country dependence and is the most conservative choice. See §2.7 for the comprehensive SE menu and the same comparison applied to the FE specifications.
Equation (static). Each country gets its own intercept \(\alpha_i\), sweeping out time-invariant unit-level confounders:
\[y_{it} = \alpha_i + \beta_1\,\text{reg}_{it} + \beta_2\,\text{edt}_{it} + \beta_3\,(\text{reg}_{it}\!\cdot\!\text{edt}_{it}) + \varepsilon_{it}\]
Algebraically equivalent to the within transformation — demeaning every variable by its country mean:
\[\tilde y_{it} = y_{it} - \bar y_i,\qquad \tilde x_{it} = x_{it} - \bar x_i,\qquad \tilde y_{it} = \beta\,\tilde x_{it} + \tilde\varepsilon_{it}\]
Equation (dynamic — AR(1)). Country FE plus a lagged DV:
\[y_{it} = \alpha_i + \phi\,y_{i,t-1} + \beta_1\,\text{reg}_{it} + \beta_2\,\text{edt}_{it} + \beta_3\,(\text{reg}_{it}\!\cdot\!\text{edt}_{it}) + \varepsilon_{it}\]
Equation (dynamic — ARDL(1,1)). Add lagged
reg, lagged edt, and the lagged interaction.
With one lag of the treatment and the moderator, students can see
whether the regime effect shows up contemporaneously, with a one-period
delay, or both:
\[y_{it} = \alpha_i + \phi\,y_{i,t-1} + \beta_0\,\text{reg}_{it} + \beta_1\,\text{reg}_{i,t-1} + \beta_2\,\text{edt}_{it} + \beta_3\,\text{edt}_{i,t-1} + \gamma_0\,(\text{reg}_{it}\!\cdot\!\text{edt}_{it}) + \gamma_1\,(\text{reg}_{i,t-1}\!\cdot\!\text{edt}_{i,t-1}) + \varepsilon_{it}\]
Logic. Identifies \(\beta\) from within-country
variation only — changes over time for the same
country. Time-invariant oil is absorbed by \(\alpha_i\). The interaction
reg × edt survives because both reg
and edt are time-varying. Three quantities to read in the
ARDL(1,1) column at education level \(\bar
e\): the impact effect \(\hat\beta_0 +
\hat\gamma_0\bar e\), the cumulative one-period effect \((\hat\beta_0 + \hat\beta_1) + (\hat\gamma_0 +
\hat\gamma_1)\bar e\), and the long-run effect \(\big[(\hat\beta_0 + \hat\beta_1) + (\hat\gamma_0 +
\hat\gamma_1)\bar e\big]/(1-\hat\phi)\). Nickell (1981)
bias caveat: with country FE and short \(T\), \(\hat\phi\) is biased downward by \(\approx -(1+\phi)/(T-1)\) (Lab 6 will fix
this with dynamic-panel GMM).
# Static FE: feols is our default — fast on large panels, native cluster SEs.
# `oil` is time-invariant in the panel and is silently absorbed by country FE.
fit_fe <- feols(ln_gdpw ~ reg * edt + oil | country,
data = democracy, cluster = ~ country)
# Dynamic FE — AR(1): add the lagged outcome.
fit_fe_dyn <- feols(ln_gdpw ~ ln_gdpw_lag + reg * edt + oil | country,
data = democracy_dyn, cluster = ~ country)
# Dynamic FE — ARDL(1,1): also add the lagged treatment, lagged moderator,
# and lagged interaction.
fit_fe_ardl11 <- feols(
ln_gdpw ~ ln_gdpw_lag +
reg + reg_lag +
edt + edt_lag +
reg:edt + reg_lag:edt_lag +
oil |
country,
data = democracy_dyn, cluster = ~ country
)
# We will need the per-country intercepts from the FE-ARDL(1,1) in Part 3 (IRF).
fe_intercepts <- fixef(fit_fe_ardl11)$country
range(fe_intercepts)
## [1] 0.3793449 0.6484020
show_reg(
list(fit_fe, fit_fe_dyn, fit_fe_ardl11),
custom.model.names = c("Static", "AR(1)", "ARDL(1,1)"),
custom.coef.map = list(
"ln_gdpw_lag" = "y_lag (φ)",
"reg" = "reg",
"reg_lag" = "reg_lag",
"edt" = "edt",
"edt_lag" = "edt_lag",
"reg:edt" = "reg × edt",
"reg_lag:edt_lag" = "reg_lag × edt_lag"
),
digits = 3
)
| Static | AR(1) | ARDL(1,1) | |
|---|---|---|---|
| y_lag (φ) | 0.944*** | 0.943*** | |
| (0.008) | (0.008) | ||
| reg | 0.117 | -0.008 | 0.017 |
| (0.089) | (0.012) | (0.022) | |
| reg_lag | -0.033 | ||
| (0.023) | |||
| edt | 0.180*** | -0.004 | 0.002 |
| (0.018) | (0.003) | (0.006) | |
| edt_lag | -0.007 | ||
| (0.006) | |||
| reg × edt | -0.029 | 0.001 | -0.002 |
| (0.019) | (0.002) | (0.004) | |
| reg_lag × edt_lag | 0.004 | ||
| (0.004) | |||
| Num. obs. | 2900 | 2847 | 2787 |
| Num. groups: country | 113 | 113 | 113 |
| R2 (full model) | 0.971 | 0.997 | 0.997 |
| R2 (proj model) | 0.429 | 0.938 | 0.933 |
| Adj. R2 (full model) | 0.970 | 0.997 | 0.997 |
| Adj. R2 (proj model) | 0.428 | 0.938 | 0.933 |
| ***p < 0.001; **p < 0.01; *p < 0.05 | |||
The same point estimates are also available via
plm(model = "within") and via LSDV
(lm(... + factor(country))); we use feols
because it is faster and exposes cluster SEs natively. We re-fit with
plm in §2.7 because vcovBK (PCSE) and
vcovSCC (Driscoll–Kraay) require a plm
object.
Reading. The static within fit shows whether the
moderation reg × edt survives once between-country
variation is stripped out. Adding the LDV concentrates variance on \(y_{i,t-1}\) and shrinks the contemporaneous
coefficients into short-run effects. In the ARDL(1,1) fit, comparing the
impact (\(\hat\beta_0\), \(\hat\gamma_0\)) to the one-period
cumulative (\(\hat\beta_0 +
\hat\beta_1\), \(\hat\gamma_0 +
\hat\gamma_1\)) tells us whether democratization’s effect — and
its moderation by education — concentrates at impact or spreads over the
next year.
Equation (static). Country and year each get their own intercept:
\[ y_{it} = \alpha_i + \lambda_t + \beta_1\,\text{reg}_{it} + \beta_2\,\text{edt}_{it} + \beta_3\,(\text{reg}_{it}\!\cdot\!\text{edt}_{it}) + \varepsilon_{it} \]
Equation (dynamic — AR(1)). Add the lagged DV:
\[ y_{it} = \alpha_i + \lambda_t + \phi\,y_{i,t-1} + \beta_1\,\text{reg}_{it} + \beta_2\,\text{edt}_{it} + \beta_3\,(\text{reg}_{it}\!\cdot\!\text{edt}_{it}) + \varepsilon_{it} \]
Equation (dynamic — ARDL(1,1)). Also lag the treatment, the moderator, and the interaction:
\[ y_{it} = \alpha_i + \lambda_t + \phi\,y_{i,t-1} + \beta_0\,\text{reg}_{it} + \beta_1\,\text{reg}_{i,t-1} + \beta_2\,\text{edt}_{it} + \beta_3\,\text{edt}_{i,t-1} + \gamma_0\,(\text{reg}_{it}\!\cdot\!\text{edt}_{it}) + \gamma_1\,(\text{reg}_{i,t-1}\!\cdot\!\text{edt}_{i,t-1}) + \varepsilon_{it} \]
Logic. Country FE removes time-invariant country differences; year FE removes shocks common to all countries in the same year. Useful when common global events affect the outcome, but Kropko & Kubinec’s warning applies: the TWFE coefficient is not simply “the effect of \(x\) controlling for country and year.” It is the slope after removing both country and year means — use it only when that comparison matches the research question.
# Static TWFE: country + year intercepts. `oil` is time-invariant and absorbed.
fit_twfe <- feols(ln_gdpw ~ reg * edt + oil |
country + year,
data = democracy, cluster = ~ country)
# Dynamic TWFE — AR(1): add the lagged outcome.
fit_twfe_dyn <- feols(ln_gdpw ~ ln_gdpw_lag + reg * edt + oil |
country + year,
data = democracy_dyn, cluster = ~ country)
# Dynamic TWFE — ARDL(1,1): also add lagged reg, lagged edt, and lagged interaction.
fit_twfe_ardl11 <- feols(
ln_gdpw ~ ln_gdpw_lag +
reg + reg_lag +
edt + edt_lag +
reg:edt + reg_lag:edt_lag +
oil |
country + year,
data = democracy_dyn, cluster = ~ country
)
show_reg(
list(fit_twfe, fit_twfe_dyn, fit_twfe_ardl11),
custom.model.names = c("Static", "AR(1)", "ARDL(1,1)"),
custom.coef.map = list(
"ln_gdpw_lag" = "y_lag (φ)",
"reg" = "reg",
"reg_lag" = "reg_lag",
"edt" = "edt",
"edt_lag" = "edt_lag",
"reg:edt" = "reg × edt",
"reg_lag:edt_lag" = "reg_lag × edt_lag"
),
digits = 3
)
| Static | AR(1) | ARDL(1,1) | |
|---|---|---|---|
| y_lag (φ) | 0.927*** | 0.926*** | |
| (0.011) | (0.011) | ||
| reg | 0.096 | 0.001 | 0.016 |
| (0.073) | (0.013) | (0.022) | |
| reg_lag | -0.020 | ||
| (0.022) | |||
| edt | 0.030 | -0.003 | -0.002 |
| (0.022) | (0.003) | (0.006) | |
| edt_lag | -0.002 | ||
| (0.006) | |||
| reg × edt | -0.015 | 0.000 | -0.001 |
| (0.016) | (0.003) | (0.004) | |
| reg_lag × edt_lag | 0.002 | ||
| (0.004) | |||
| Num. obs. | 2900 | 2847 | 2787 |
| Num. groups: country | 113 | 113 | 113 |
| Num. groups: year | 28 | 28 | 27 |
| R2 (full model) | 0.980 | 0.997 | 0.997 |
| R2 (proj model) | 0.010 | 0.847 | 0.844 |
| Adj. R2 (full model) | 0.979 | 0.997 | 0.997 |
| Adj. R2 (proj model) | 0.009 | 0.847 | 0.844 |
| ***p < 0.001; **p < 0.01; *p < 0.05 | |||
Reading. Compared to country-only FE, TWFE absorbs
additional variance through the year intercepts \(\lambda_t\) — these capture global shocks
(oil crises, financial cycles) that affect every country at once. The
dynamic columns show the same shrinkage of contemporaneous coefficients
we saw in §2.2 once \(y_{i,t-1}\)
enters. The ARDL(1,1) row gives a cleaner read on the timing: near-zero
reg_lag and reg_lag × edt_lag say
democratization’s effect (and its moderation) are contemporaneous;
nonzero values say they spread over the next year.
Equation (static). Country intercepts are random draws from a common population:
\[ y_{it} = \alpha + \beta_1\,\text{reg}_{it} + \beta_2\,\text{edt}_{it} + \beta_3\,(\text{reg}_{it}\!\cdot\!\text{edt}_{it}) + \beta_4\,\text{oil}_i + (u_i + \varepsilon_{it}),\qquad u_i \sim \mathcal{N}(0,\sigma_\alpha^2) \]
Logic. \(\hat\beta\) is a matrix-weighted average of within and between estimators, with weight
\[ \theta = 1 - \sqrt{\frac{\sigma_\varepsilon^2}{\sigma_\varepsilon^2 + T\,\sigma_\alpha^2}} \]
— so \(\theta=0\) collapses to
pooled OLS and \(\theta=1\) collapses
to FE. RE is partial pooling toward the population
mean. Two key features: RE estimates the coefficient on
oil (FE cannot, since oil is
time-invariant), at the cost of needing \(u_i
\perp X_i\) — which the §1.1 DAG says is suspect.
We fit two versions:
Why no LDV in RE here. A lagged DV in an RE model carries the Wooldridge (2005) initial-conditions issue: \(y_{i,0}\) is correlated with the random intercept \(u_i\), so the lagged DV is not strictly exogenous in the RE framework. The methodologically clean fix is Wooldridge’s correlated-random-effects setup or Arellano-Bond GMM (Lab 6). For RE here, corARMA(1) handles the persistence in the residuals without disturbing the random-intercept structure — the natural dynamic version, just like for Mundlak/REWB in §2.5.
# Static RE.
fit_re <- lme(
fixed = ln_gdpw ~ reg * edt + oil,
random = ~ 1 | country,
data = democracy,
na.action = na.omit,
method = "REML"
)
# RE — corARMA(1): AR(1) on the residuals, no lagged DV.
fit_re_corar <- lme(
fixed = ln_gdpw ~ reg * edt + oil,
random = ~ 1 | country,
correlation = corARMA(form = ~ year | country, p = 1, q = 0),
data = democracy,
na.action = na.omit,
method = "REML"
)
show_reg(
list(fit_re, fit_re_corar),
custom.model.names = c("Static", "corARMA(1)"),
custom.coef.map = list(
"(Intercept)" = "(Intercept)",
"reg" = "reg",
"edt" = "edt",
"reg:edt" = "reg × edt",
"oil" = "oil"
),
digits = 3
)
| Static | corARMA(1) | |
|---|---|---|
| (Intercept) | 7.703*** | 8.238*** |
| (0.071) | (0.095) | |
| reg | 0.119*** | 0.059** |
| (0.030) | (0.019) | |
| edt | 0.183*** | 0.058*** |
| (0.004) | (0.006) | |
| reg × edt | -0.027*** | -0.012** |
| (0.006) | (0.004) | |
| oil | 0.539** | 0.257 |
| (0.201) | (0.267) | |
| AIC | -1009.712 | -7024.916 |
| BIC | -967.916 | -6977.150 |
| Log Likelihood | 511.856 | 3520.458 |
| Num. obs. | 2900 | 2900 |
| Num. groups: country | 113 | 113 |
| ***p < 0.001; **p < 0.01; *p < 0.05 | ||
Reading. Compare the reg row across the
two columns. In the static fit, \(\hat\beta_{\text{reg}}\) is the
partially-pooled level effect with no dynamic correction; the residuals
are assumed iid which is implausible here (§1.3.4 found I(1)-like
persistence). In corARMA(1), \(\hat\beta_{\text{reg}}\) is the
partially-pooled level effect with an AR(1) error structure —
it is already long-run by construction. Adding
corARMA(1) typically widens SEs (the residuals are now allowed to be
persistent rather than iid), but should not move the within-between
mixture much. The §3.5 IRF below gets the dynamic story from the
FE-ARDL(1,1), where the lagged-DV form gives a clean closed-form impulse
response.
Equation (static). Decompose every time-varying
regressor — including the interaction reg × edt — into
within and between components, and put both in the
model with separate coefficients:
\[ y_{it} = \alpha + \beta_W^{\text{reg}}\,\text{reg}^{W}_{it} + \beta_B^{\text{reg}}\,\overline{\text{reg}}_i + \beta_W^{\text{edt}}\,\text{edt}^{W}_{it} + \beta_B^{\text{edt}}\,\overline{\text{edt}}_i + \gamma_W\,(\text{reg}\!\cdot\!\text{edt})^{W}_{it} + \gamma_B\,\overline{(\text{reg}\!\cdot\!\text{edt})}_i + \beta_4\,\text{oil}_i + (u_i + \varepsilon_{it}) \]
where \(x^{W}_{it} = x_{it} - \bar x_i\) and we treat \(\text{reg}\cdot\text{edt}\) as a single variable to be demeaned, then split.
Equation (dynamic — corARMA(1)). Same equation; AR(1) on the residuals:
\[ \dots + u_{it}, \qquad u_{it} = \rho u_{i,t-1} + \varepsilon_{it} \]
Logic. This is Mundlak (1978), repackaged by Bell & Jones (2015) as REWB (random effects within-between). The within-mean coefficient \(\beta_W\) picks up the FE-style identification; the between-mean coefficient \(\beta_B\) picks up the cross-sectional contrast that pure RE was using implicitly. We see both, and \(\beta_W = \beta_B\) is testable (Hausman / Mundlak diagnostic in §2.6). Bell & Jones argue REWB should be the default — it returns more information than FE or RE alone.
Why no LDV here. Adding \(y_{i,t-1}\) to the within-between decomposition either (a) imports a between component into a model whose whole point is the within-between split, or (b) creates mechanical collinearity if you decompose the lag itself (\(\bar y_i\) is the empirical analog of \(u_i\)). corARMA absorbs persistence in the residuals without disturbing the decomposition, so it is the natural dynamic version for REWB.
# parameters::demean splits each time-varying regressor into _within and _between.
# Because `reg × edt` involves two time-varying regressors, we precompute the
# product `reg_x_edt` at the country-year level, then demean it as a single
# variable. `oil` is time-invariant — no within variation — so it enters as a
# single between-only term.
mund <- democracy |>
drop_na(ln_gdpw, reg, edt, oil) |>
mutate(reg_x_edt = reg * edt) |>
parameters::demean(select = c("reg", "edt", "reg_x_edt"),
by = "country", verbose = FALSE)
# Static REWB: within + between coefficients estimated separately,
# RE intercept retained.
fit_mundlak <- lme(
fixed = ln_gdpw ~ reg_within + reg_between +
edt_within + edt_between +
reg_x_edt_within + reg_x_edt_between +
oil,
random = ~ 1 | country,
data = mund,
na.action = na.omit,
method = "REML"
)
# Dynamic REWB via corARMA(1). Persistence absorbed in the residuals,
# within/between decomposition preserved.
fit_mundlak_corar <- lme(
fixed = ln_gdpw ~ reg_within + reg_between +
edt_within + edt_between +
reg_x_edt_within + reg_x_edt_between +
oil,
random = ~ 1 | country,
correlation = corARMA(form = ~ year | country, p = 1, q = 0),
data = mund,
na.action = na.omit,
method = "REML"
)
show_reg(
list(fit_mundlak, fit_mundlak_corar),
custom.model.names = c("Static", "+ corARMA(1)"),
custom.coef.map = list(
"(Intercept)" = "(Intercept)",
"reg_within" = "reg (within)",
"reg_between" = "reg (between)",
"edt_within" = "edt (within)",
"edt_between" = "edt (between)",
"reg_x_edt_within" = "reg × edt (within)",
"reg_x_edt_between" = "reg × edt (between)",
"oil" = "oil"
),
digits = 3
)
| Static |
|
|
|---|---|---|
| (Intercept) | 7.216*** | 7.141*** |
| (0.130) | (0.134) | |
| reg (within) | 0.117*** | 0.055** |
| (0.030) | (0.019) | |
| reg (between) | 0.927** | 0.907** |
| (0.324) | (0.334) | |
| edt (within) | 0.180*** | 0.051*** |
| (0.004) | (0.006) | |
| edt (between) | 0.228*** | 0.226*** |
| (0.030) | (0.031) | |
| reg × edt (within) | -0.029*** | -0.012** |
| (0.006) | (0.004) | |
| reg × edt (between) | -0.044 | -0.037 |
| (0.047) | (0.049) | |
| oil | 0.718*** | 0.652*** |
| (0.170) | (0.175) | |
| AIC | -1035.523 | -7109.690 |
| BIC | -975.826 | -7044.023 |
| Log Likelihood | 527.762 | 3565.845 |
| Num. obs. | 2900 | 2900 |
| Num. groups: country | 113 | 113 |
| ***p < 0.001; **p < 0.01; *p < 0.05 | ||
Reading. Six coefficients on the regime/education block, instead of three:
reg_within / reg_between — within-country
effect of becoming democratic vs the cross-country slope.edt_within / edt_between — within-country
effect of rising education vs the cross-country slope.reg_x_edt_within / reg_x_edt_between —
does the moderation by education operate within countries (rising
education changes the regime effect over time within the same country)
or between (more-educated countries have a different regime-growth slope
on average)?Time-invariant oil retains its main effect (RE’s
flexibility, FE’s shortcoming). For our research question — the
within effect of regime change and its moderation by education —
reg_within and reg_x_edt_within are the
relevant estimates. Adding corARMA(1) typically widens SEs but should
not move the within-between split materially.
Two formal tests targeting the same hypothesis:
# Mundlak Wald: jointly tests reg_within = reg_between, and similarly for the
# moderation and edt. The contrast matrix L picks out (within - between) for
# each regressor; W = (Lb)' (LVL')^{-1} (Lb) is chi-square under H0.
b <- fixef(fit_mundlak)
V <- vcov(fit_mundlak)
contrasts_named <- list(
c("reg_within" = 1, "reg_between" = -1),
c("edt_within" = 1, "edt_between" = -1),
c("reg_x_edt_within" = 1, "reg_x_edt_between" = -1)
)
L <- t(sapply(contrasts_named, function(cn) {
row <- setNames(rep(0, length(b)), names(b))
row[names(cn)] <- cn
row
}))
delta <- L %*% b
W <- t(delta) %*% solve(L %*% V %*% t(L)) %*% delta
sprintf("Mundlak Wald = %.2f, df = %d, p = %.3f",
W, nrow(L), 1 - pchisq(W, df = nrow(L)))
## [1] "Mundlak Wald = 52.04, df = 3, p = 0.000"
# Classical Hausman: compares within and RE coefficients on time-varying
# regressors only. Coarser than the Mundlak Wald above, but the two should
# agree in direction. (`oil` is time-invariant; FE drops it, RE estimates it.)
fit_fe_plm <- plm(ln_gdpw ~ reg * edt + oil,
data = democracy, index = c("country_f", "year_f"),
model = "within")
fit_re_plm <- plm(ln_gdpw ~ reg * edt + oil,
data = democracy, index = c("country_f", "year_f"),
model = "random")
phtest(fit_fe_plm, fit_re_plm)
##
## Hausman Test
##
## data: ln_gdpw ~ reg * edt + oil
## chisq = 61.75, df = 3, p-value = 2.485e-13
## alternative hypothesis: one model is inconsistent
The Mundlak Wald is more granular — it tests each within-vs-between contrast individually. Hausman is the classical aggregate version. They should agree in direction.
Important caveat (Bell & Jones 2015). Do not use Hausman as a gatekeeper that “picks” FE vs RE. Even when it rejects, REWB is often the better choice because it returns both within and between estimates. Treat the test as a diagnostic — does my within story differ from my between story? — not a model-selection switch.
OLS / pooled / FE point estimates are unbiased under exogeneity even if errors are autocorrelated and cross-sectionally dependent. The standard errors are not. Lab 1 §3 covered cluster-robust SEs in detail; this section adds the two panel-specific corrections we need here.
| SE | Allows | When |
|---|---|---|
| Cluster-robust by unit | arbitrary within-country correlation | many clusters (≥30-50), residuals roughly independent across countries |
| PCSE (Beck & Katz 1995) | contemporaneous cross-sectional correlation | many countries, TSCS data, residuals correlated across countries in the same year |
| Driscoll–Kraay (1998) | cross-sectional + serial correlation, any pattern | moderate-to-large T; the safest default when both pathologies are present |
Cluster-robust SEs assume independence across clusters, which is fine for classrooms and survey strata but will under-state uncertainty in a TSCS panel where macroeconomic shocks hit many countries in the same year. PCSE corrects for that contemporaneous cross-country dependence; Driscoll–Kraay then layers serial correlation on top.
# vcovBK and vcovSCC require a plm object; use fit_fe_plm fitted in §2.6.
ses <- list(
cluster = vcovHC(fit_fe_plm, method = "arellano"), # cluster by unit
pcse = vcovBK(fit_fe_plm), # Beck-Katz PCSE
driscoll = vcovSCC(fit_fe_plm) # Driscoll-Kraay
)
target <- c("reg", "edt", "reg:edt")
ses_tab <- map(ses, function(V)
coeftest(fit_fe_plm, vcov. = V)[target, 2])
bind_cols(coefficient = target, as_tibble(ses_tab)) |>
kable(digits = 4,
caption = "Standard errors on the within-FE estimate of (reg, edt, reg:edt) under three corrections.") |>
kable_styling(full_width = FALSE)
| coefficient | cluster | pcse | driscoll |
|---|---|---|---|
| reg | 0.0888 | 0.0821 | 0.0628 |
| edt | 0.0175 | 0.0163 | 0.0231 |
| reg:edt | 0.0186 | 0.0175 | 0.0081 |
PCSE and Driscoll–Kraay are typically a notch larger than the cluster-robust SE here, exactly because they admit cross-country dependence that cluster-by-unit assumes away.
We close Part 2 with two side-by-side comparison tables:
show_reg(
list(fit_pool, fit_fe, fit_twfe, fit_re, fit_mundlak),
custom.model.names = c("Pooled OLS", "FE (within)", "TWFE", "RE", "Mundlak / REWB"),
custom.coef.map = list(
reg = "reg",
edt = "edt",
"reg:edt" = "reg x edt",
oil = "oil",
reg_within = "reg (within)",
reg_between = "reg (between)",
edt_within = "edt (within)",
edt_between = "edt (between)",
reg_x_edt_within = "reg x edt (within)",
reg_x_edt_between = "reg x edt (between)"
),
digits = 3,
caption = "Table 1: static specifications."
)
| Pooled OLS | FE (within) | TWFE | RE | Mundlak / REWB | |
|---|---|---|---|---|---|
| reg | 0.633*** | 0.117 | 0.096 | 0.119*** | |
| (0.059) | (0.089) | (0.073) | (0.030) | ||
| edt | 0.233*** | 0.180*** | 0.030 | 0.183*** | |
| (0.006) | (0.018) | (0.022) | (0.004) | ||
| reg x edt | -0.027** | -0.029 | -0.015 | -0.027*** | |
| (0.009) | (0.019) | (0.016) | (0.006) | ||
| oil | 0.752*** | 0.539** | 0.718*** | ||
| (0.037) | (0.201) | (0.170) | |||
| reg (within) | 0.117*** | ||||
| (0.030) | |||||
| reg (between) | 0.927** | ||||
| (0.324) | |||||
| edt (within) | 0.180*** | ||||
| (0.004) | |||||
| edt (between) | 0.228*** | ||||
| (0.030) | |||||
| reg x edt (within) | -0.029*** | ||||
| (0.006) | |||||
| reg x edt (between) | -0.044 | ||||
| (0.047) | |||||
| R2 | 0.656 | ||||
| Adj. R2 | 0.656 | ||||
| Num. obs. | 2734 | 2900 | 2900 | 2900 | 2900 |
| Num. groups: country | 113 | 113 | 113 | 113 | |
| R2 (full model) | 0.971 | 0.980 | |||
| R2 (proj model) | 0.429 | 0.010 | |||
| Adj. R2 (full model) | 0.970 | 0.979 | |||
| Adj. R2 (proj model) | 0.428 | 0.009 | |||
| Num. groups: year | 28 | ||||
| AIC | -1009.712 | -1035.523 | |||
| BIC | -967.916 | -975.826 | |||
| Log Likelihood | 511.856 | 527.762 | |||
| ***p < 0.001; **p < 0.01; *p < 0.05 | |||||
# Pooled OLS uses the FD + ARDL(1,1) model from §2.1 (the level ARDL(1,1) was
# nonstationary). FE/TWFE/RE use the level ARDL(1,1). Mundlak uses corARMA(1)
# since LDV breaks the within/between decomposition.
show_reg(
list(fit_pool_fd_ardl11, fit_fe_ardl11, fit_twfe_ardl11, fit_re_corar,
fit_mundlak_corar),
custom.model.names = c("Pooled — FD ARDL(1,1)", "FE — ARDL(1,1)",
"TWFE — ARDL(1,1)", "RE — corARMA(1)",
"Mundlak — corARMA(1)"),
custom.coef.map = list(
"ln_gdpw_lag" = "y_lag (φ, level)",
"d_ln_gdpw_lag" = "Δy_lag (φ, FD)",
reg = "reg",
reg_lag = "reg_lag",
edt = "edt",
edt_lag = "edt_lag",
"reg:edt" = "reg x edt",
"reg_lag:edt_lag" = "reg_lag x edt_lag",
oil = "oil",
reg_within = "reg (within)",
reg_between = "reg (between)",
edt_within = "edt (within)",
edt_between = "edt (between)",
reg_x_edt_within = "reg x edt (within)",
reg_x_edt_between = "reg x edt (between)"
),
digits = 3,
caption = "Table 2: dynamic specifications. Pooled OLS uses the FD + ARDL(1,1) picked in §2.1 (its outcome is Δy, hence the 'φ, FD' row). FE/TWFE use the level ARDL(1,1). RE and Mundlak use corARMA(1) — LDV in RE faces the Wooldridge initial-conditions issue; LDV in REWB breaks the within-between decomposition."
)
| Pooled — FD ARDL(1,1) | FE — ARDL(1,1) | TWFE — ARDL(1,1) | RE — corARMA(1) | Mundlak — corARMA(1) | |
|---|---|---|---|---|---|
| y_lag (φ, level) | 0.943*** | 0.926*** | |||
| (0.008) | (0.011) | ||||
| Δy_lag (φ, FD) | 0.151*** | ||||
| (0.019) | |||||
| reg | 0.012 | 0.017 | 0.016 | 0.059** | |
| (0.019) | (0.022) | (0.022) | (0.019) | ||
| reg_lag | -0.010 | -0.033 | -0.020 | ||
| (0.019) | (0.023) | (0.022) | |||
| edt | 0.002 | 0.002 | -0.002 | 0.058*** | |
| (0.007) | (0.006) | (0.006) | (0.006) | ||
| edt_lag | -0.002 | -0.007 | -0.002 | ||
| (0.007) | (0.006) | (0.006) | |||
| reg x edt | -0.001 | -0.002 | -0.001 | -0.012** | |
| (0.004) | (0.004) | (0.004) | (0.004) | ||
| reg_lag x edt_lag | 0.001 | 0.004 | 0.002 | ||
| (0.004) | (0.004) | (0.004) | |||
| oil | -0.002 | 0.257 | 0.652*** | ||
| (0.004) | (0.267) | (0.175) | |||
| reg (within) | 0.055** | ||||
| (0.019) | |||||
| reg (between) | 0.907** | ||||
| (0.334) | |||||
| edt (within) | 0.051*** | ||||
| (0.006) | |||||
| edt (between) | 0.226*** | ||||
| (0.031) | |||||
| reg x edt (within) | -0.012** | ||||
| (0.004) | |||||
| reg x edt (between) | -0.037 | ||||
| (0.049) | |||||
| R2 | 0.023 | ||||
| Adj. R2 | 0.020 | ||||
| Num. obs. | 2734 | 2787 | 2787 | 2900 | 2900 |
| Num. groups: country | 113 | 113 | 113 | 113 | |
| R2 (full model) | 0.997 | 0.997 | |||
| R2 (proj model) | 0.933 | 0.844 | |||
| Adj. R2 (full model) | 0.997 | 0.997 | |||
| Adj. R2 (proj model) | 0.933 | 0.844 | |||
| Num. groups: year | 27 | ||||
| AIC | -7024.916 | -7109.690 | |||
| BIC | -6977.150 | -7044.023 | |||
| Log Likelihood | 3520.458 | 3565.845 | |||
| ***p < 0.001; **p < 0.01; *p < 0.05 | |||||
Headline reading.
reg × edt to its FE counterpart: if the moderation
collapses under FE, it was driven by between-country variation
(more-educated countries differ from less-educated countries in their
average regime-growth slope). If it survives, the moderation is a
within-country phenomenon (rising education within a country changes the
regime effect). The Mundlak / REWB split
(reg × edt (within) vs reg × edt (between))
makes which story is doing the work explicit.reg_lag, reg_lag × edt_lag) tell us
whether democracy’s effect — and its moderation — is contemporaneous or
spreads over a year.This robustness pattern — “the within story is consistent across FE/TWFE/REWB; the between story is visible in pooled OLS and REWB’s between block” — is exactly why Kropko & Kubinec (2020) tell us to state the comparison our coefficient is making, rather than picking one estimator and pretending the others were wrong.
Lab 4 introduced parametric simulation for a single time series. Part 3 extends the same logic to panels, but we keep the estimand fixed-effects-centered:
These are different quantities. The first is a future forecast. The second is an average within-sample contrast. TWFE is much easier to defend for the second quantity because the year fixed effects are already observed; future year effects are not.
Part 2 already fit the dynamic country-FE and TWFE models we need here. We just pull out the persistence coefficients for reference:
phi_fe <- coef(fit_fe_dyn)["ln_gdpw_lag"]
phi_twfe <- coef(fit_twfe_dyn)["ln_gdpw_lag"]
sprintf("phi_fe = %.3f, phi_twfe = %.3f", phi_fe, phi_twfe)
## [1] "phi_fe = 0.944, phi_twfe = 0.927"
Nickell (1981) bias caveat. With small \(T\) and country FE, the within-FE estimate of \(\hat\phi\) is biased downward by approximately \(-(1+\phi)/(T-1)\). For our \(T_i \in [10, 30]\), the bias is not trivial. Lab 6 will introduce dynamic-panel GMM to address it. For today, the goal is interpretation and simulation, not a final causal estimate.
The first counterfactual question is:
If one historically authoritarian country democratized permanently after the observed sample, what would its expected log-GDP-per-worker path look like?
We choose an always-autocratic country with the longest observed history. That gives us a clean baseline and a well-defined last observed value for the dynamic recursion.
last_year <- max(democracy$year)
maxT <- max(table(democracy$country))
ever_aut <- democracy |>
group_by(country, ctyname) |>
summarise(always_aut = all(reg == 0, na.rm = TRUE),
n_obs = n(),
n_yreg = sum(!is.na(ln_gdpw) & !is.na(reg)),
last_obs = max(year), .groups = "drop") |>
filter(always_aut, n_obs == maxT, n_yreg == maxT, last_obs == last_year)
focus_country <- ever_aut$ctyname[1]
focus_id <- ever_aut$country[1]
sprintf("Focus country: %s (id = %d)", focus_country, focus_id)
## [1] "Focus country: Egypt (id = 14)"
H <- 20 # forecast horizon (years)
focus_df <- democracy |> filter(country == focus_id) |> arrange(year)
y_anchor <- tail(focus_df$ln_gdpw, 1) # last observed log GDP
edt_hold <- tail(na.omit(focus_df$edt), 1) # last non-NA education
sprintf("y_anchor = %.3f, edt_hold = %.3f", y_anchor, edt_hold)
## [1] "y_anchor = 8.838, edt_hold = 5.390"
mvrnormThe dynamic country-FE recursion is:
\[ y_t = \hat\alpha_i + \hat\phi y_{t-1} + \hat\beta_{\text{reg}}\cdot 1 + \hat\beta_{\text{edt}}\cdot \text{edt}_T + \hat\beta_{\text{int}}\cdot 1\cdot \text{edt}_T. \]
The counterfactual fixes reg = 1 for every forecast
year, holds education at the last observed value, and initializes the
recursion at the last observed log GDP.
set.seed(2025)
S <- 1000
alpha_fe <- fixef(fit_fe_dyn)$country[as.character(focus_id)]
draws_fe <- mvrnorm(n = S, mu = coef(fit_fe_dyn), Sigma = vcov(fit_fe_dyn))
# Store one simulated forecast path in each row.
sim_fe_manual <- matrix(NA_real_, nrow = S, ncol = H)
for (s in seq_len(S)) {
# Extract one draw of the dynamic FE coefficients.
b <- draws_fe[s, ]
# Start the recursion at the last observed outcome.
y_lag <- y_anchor
for (h in seq_len(H)) {
# Expected value in forecast year h under permanent democratization.
# The stochastic innovation has expectation zero, so this is an expected
# counterfactual path rather than a predicted value path.
y_now <- alpha_fe +
b["ln_gdpw_lag"] * y_lag +
b["reg"] * 1 +
b["edt"] * edt_hold +
b["reg:edt"] * 1 * edt_hold
# Save and update the lag for the next forecast year.
sim_fe_manual[s, h] <- y_now
y_lag <- y_now
}
}
fe_manual_band <- tibble(
year = (last_year + 1):(last_year + H),
pe = apply(sim_fe_manual, 2, mean),
lo = apply(sim_fe_manual, 2, quantile, 0.025),
hi = apply(sim_fe_manual, 2, quantile, 0.975),
method = "Manual FE"
)
We hold \(\hat\alpha_i\) fixed across simulations. That keeps the example focused on coefficient uncertainty and dynamic propagation. A more advanced version could also simulate uncertainty in the estimated country intercept.
simcfldvsimev() can reproduce the same expected path if we
pass the country FE as the model constant. The lagged dependent variable
coefficient is passed separately as phi, so it should not
appear in the simulated coefficient matrix b.
# Build the hypothetical-X matrix from the non-lagged covariates.
formula_fe_static <- ln_gdpw ~ reg * edt
xhyp_fe <- cfMake(formula_fe_static, data = democracy_dyn, nscen = H)
# Set the same permanent-democracy scenario for each forecast year.
for (h in seq_len(H)) {
xhyp_fe <- cfChange(xhyp_fe, "reg", x = 1, scen = h)
xhyp_fe <- cfChange(xhyp_fe, "edt", x = edt_hold, scen = h)
}
# Put the country intercept in the first column of b. This makes it the
# "constant" for this particular country-specific forecast.
coef_fe <- coef(fit_fe_dyn)
coef_fe_x <- coef_fe[names(coef_fe) != "ln_gdpw_lag"]
draws_fe_x <- draws_fe[, names(coef_fe_x), drop = FALSE]
phi_fe_draws <- draws_fe[, "ln_gdpw_lag", drop = FALSE]
simbetas_fe <- cbind(alpha_fe = rep(alpha_fe, S), draws_fe_x)
sim_fe_obj <- ldvsimev(
x = xhyp_fe,
b = simbetas_fe,
ci = 0.95,
constant = 1,
phi = phi_fe_draws,
lagY = y_anchor
)
fe_simcf_band <- tibble(
year = (last_year + 1):(last_year + H),
pe = sim_fe_obj$pe,
lo = sim_fe_obj$lower,
hi = sim_fe_obj$upper,
method = "simcf FE"
)
hist_df <- focus_df |>
transmute(year, observed = ln_gdpw)
forecast_bands <- bind_rows(fe_manual_band, fe_simcf_band)
ggplot() +
geom_line(data = hist_df, aes(year, observed), colour = "grey20") +
geom_ribbon(data = forecast_bands,
aes(year, ymin = lo, ymax = hi, fill = method),
alpha = 0.22) +
geom_line(data = forecast_bands,
aes(year, pe, colour = method), linewidth = 0.75) +
geom_vline(xintercept = last_year, linetype = "dashed") +
labs(x = NULL, y = "log GDP per worker",
title = paste0("Counterfactual democratization, ", focus_country),
subtitle = "Dashed line: end of observed sample. Scenario sets reg = 1 for every forecast year.",
colour = NULL, fill = NULL) +
theme_minimal(base_size = 11) +
theme(legend.position = "bottom")
One-country counterfactual forecast under dynamic country fixed effects. The manual simulation and simcf reproduce the same expected-value logic.
The §3.3–3.4 forecast plotted the level of \(y\) under permanent democratization. The
same dynamic, expressed as a difference vs a no-change
counterfactual baseline, is an impulse response function
(IRF) — the panel analog of Lab 4’s IRF. We use the
FE-ARDL(1,1) model from §2.2 because its
lagged-treatment terms reg_lag and
reg_lag × edt_lag are the natural channel to display
moderation by education over time, not just at impact.
Closed form. The ARDL(1,1) recursion under permanent democratization at \(t = 0\), with education held at value \(e\), gives
\[ \mathrm{IRF}(h \mid e) \;=\; A(e) \cdot \frac{1 - \phi^{h+1}}{1 - \phi} \;+\; B(e) \cdot \frac{1 - \phi^{h}}{1 - \phi}, \]
where \(A(e) = \beta_0 + \gamma_0\,
e\) is the impact effect and \(B(e) = \beta_1 + \gamma_1\, e\) is the
delayed transmission through reg_lag and
reg_lag × edt_lag. (\(\beta_0\) is the coefficient on
reg, \(\beta_1\) on
reg_lag, \(\gamma_0\) on
reg × edt, \(\gamma_1\) on
reg_lag × edt_lag, \(\phi\) on ln_gdpw_lag. Holding
edt constant means \(\text{edt}_t
= \text{edt}_{t-1} = e\) throughout the counterfactual.)
Long-run equilibrium effect (LRE). As \(h \to \infty\), assuming \(|\phi| < 1\):
\[ \mathrm{LRE}(e) \;=\; \frac{A(e) + B(e)}{1 - \phi} \;=\; \frac{\beta_0 + \beta_1 + (\gamma_0 + \gamma_1)\,e}{1 - \phi}. \]
Implementation. Parametric simulation: draw \(S = 1{,}000\) coefficient vectors from
MVN(coef, vcov), evaluate the closed-form IRF at each draw and each
horizon, take the 95% band over draws. We compute IRF and LRE at two
education levels — the 25th and 75th percentile of edt in
the panel — so the moderation by education is visible.
# Closed-form IRF for one parameter draw, one edt value, horizons 0..H.
# IRF(h | e) = A(e) * (1 - phi^(h+1))/(1 - phi) + B(e) * (1 - phi^h)/(1 - phi)
# with B's term zero at h = 0 (no lagged-treatment contribution at impact).
irf_path <- function(e, phi, beta0, beta1, gamma0, gamma1, H = 20) {
A <- beta0 + gamma0 * e
B <- beta1 + gamma1 * e
h_seq <- 0:H
geom_h_plus_1 <- (1 - phi^(h_seq + 1)) / (1 - phi)
geom_h <- ifelse(h_seq == 0, 0, (1 - phi^h_seq) / (1 - phi))
A * geom_h_plus_1 + B * geom_h
}
# Two education levels: 25th and 75th percentiles of observed edt.
edt_lo <- quantile(democracy$edt, 0.25, na.rm = TRUE)
edt_hi <- quantile(democracy$edt, 0.75, na.rm = TRUE)
sprintf("edt low (P25) = %.2f, edt high (P75) = %.2f", edt_lo, edt_hi)
## [1] "edt low (P25) = 2.21, edt high (P75) = 7.08"
# Parametric simulation from FE-ARDL(1,1) joint coefficient distribution.
set.seed(2025)
S <- 1000
H_irf <- 20
draws_ardl <- mvrnorm(n = S,
mu = coef(fit_fe_ardl11),
Sigma = vcov(fit_fe_ardl11))
# Evaluate IRF for each draw, at edt_lo and edt_hi.
irf_lo <- matrix(NA_real_, S, H_irf + 1)
irf_hi <- matrix(NA_real_, S, H_irf + 1)
for (s in seq_len(S)) {
d <- draws_ardl[s, ]
irf_lo[s, ] <- irf_path(e = edt_lo,
phi = d["ln_gdpw_lag"],
beta0 = d["reg"],
beta1 = d["reg_lag"],
gamma0 = d["reg:edt"],
gamma1 = d["reg_lag:edt_lag"],
H = H_irf)
irf_hi[s, ] <- irf_path(e = edt_hi,
phi = d["ln_gdpw_lag"],
beta0 = d["reg"],
beta1 = d["reg_lag"],
gamma0 = d["reg:edt"],
gamma1 = d["reg_lag:edt_lag"],
H = H_irf)
}
irf_band <- bind_rows(
tibble(h = 0:H_irf,
pe = colMeans(irf_lo),
lo = apply(irf_lo, 2, quantile, 0.025),
hi = apply(irf_lo, 2, quantile, 0.975),
scenario = sprintf("Low education (edt = %.2f, P25)", edt_lo)),
tibble(h = 0:H_irf,
pe = colMeans(irf_hi),
lo = apply(irf_hi, 2, quantile, 0.025),
hi = apply(irf_hi, 2, quantile, 0.975),
scenario = sprintf("High education (edt = %.2f, P75)", edt_hi))
)
ggplot(irf_band, aes(h, pe, colour = scenario, fill = scenario)) +
geom_ribbon(aes(ymin = lo, ymax = hi), alpha = 0.22, colour = NA) +
geom_line(linewidth = 0.75) +
geom_hline(yintercept = 0, linetype = "dashed", colour = "grey45") +
labs(x = "Horizon h (years after democratization)",
y = "Cumulative effect on log GDP per worker",
title = "Impulse response: permanent democratization, FE-ARDL(1,1)",
colour = NULL, fill = NULL) +
theme_minimal(base_size = 11) +
theme(legend.position = "bottom")
Impulse response function: cumulative effect of permanent democratization on log GDP per worker, from the FE-ARDL(1,1) model. 95% parametric simulation bands. The asymptote of each curve is the long-run equilibrium effect (LRE).
# Long-run equilibrium effect (LRE) for each edt value, with simulation band.
# LRE(e) = (beta_0 + beta_1 + (gamma_0 + gamma_1) * e) / (1 - phi).
lre_lo_draws <- (draws_ardl[, "reg"] + draws_ardl[, "reg_lag"] +
(draws_ardl[, "reg:edt"] + draws_ardl[, "reg_lag:edt_lag"]) * edt_lo) /
(1 - draws_ardl[, "ln_gdpw_lag"])
lre_hi_draws <- (draws_ardl[, "reg"] + draws_ardl[, "reg_lag"] +
(draws_ardl[, "reg:edt"] + draws_ardl[, "reg_lag:edt_lag"]) * edt_hi) /
(1 - draws_ardl[, "ln_gdpw_lag"])
lre_summary <- tibble(
scenario = c(sprintf("Low education (edt = %.2f, P25)", edt_lo),
sprintf("High education (edt = %.2f, P75)", edt_hi)),
pe = c(mean(lre_lo_draws), mean(lre_hi_draws)),
lo = c(quantile(lre_lo_draws, 0.025), quantile(lre_hi_draws, 0.025)),
hi = c(quantile(lre_lo_draws, 0.975), quantile(lre_hi_draws, 0.975))
)
lre_summary |>
kable(digits = 3,
caption = "Long-run equilibrium effect of permanent democratization, FE-ARDL(1,1). 95% parametric simulation interval.") |>
kable_styling(full_width = FALSE)
| scenario | pe | lo | hi |
|---|---|---|---|
| Low education (edt = 2.21, P25) | -0.222 | -0.556 | 0.083 |
| High education (edt = 7.08, P75) | -0.052 | -0.370 | 0.269 |
Reading.
reg_lag and
reg_lag × edt_lag, plus one round of \(\phi\)-driven persistence on the impact
effect.edt curve sits above the low-edt curve,
more-educated workforces benefit more from democratization in the long
run — a substantively meaningful within-country moderation by human
capital. If they overlap, the moderation visible in the static
reg × edt coefficient is a between-country phenomenon that
disappears once we model dynamics within country.