Complete GMM Guide¶
Quick Reference
Module: panelbox.gmm
Key classes: DifferenceGMM, SystemGMM
Stata equivalent: xtabond2
R equivalent: plm::pgmm()
What You Will Learn
This step-by-step guide walks through the entire GMM workflow:
- Understanding when and why you need GMM
- Preparing data and checking panel structure
- Estimating Difference and System GMM
- Running and interpreting all diagnostic tests
- Choosing between models
- Reporting results for publication
Prerequisites: Basic familiarity with panel data and regression analysis.
Part 1: Understanding the Problem¶
The Dynamic Panel Challenge¶
Consider a typical dynamic panel data model where the outcome depends on its own past value:
where:
- \(y_{it}\): Dependent variable (e.g., log employment for firm \(i\) at time \(t\))
- \(y_{i,t-1}\): Lagged dependent variable (captures persistence)
- \(X_{it}\): Exogenous regressors (e.g., wages, capital stock)
- \(\alpha_i\): Unobserved individual fixed effect
- \(\varepsilon_{it}\): Idiosyncratic error term
The fundamental problem: Standard estimators all fail in this setup.
Why Standard Estimators Fail¶
Pooled OLS ignores \(\alpha_i\), creating omitted variable bias. Since \(\alpha_i\) affects all periods, \(y_{i,t-1}\) is correlated with \(\alpha_i\). Result: OLS overestimates \(\gamma\) (biased upward).
Fixed Effects (Within) removes \(\alpha_i\) by demeaning, but the demeaned lagged variable \((y_{i,t-1} - \bar{y}_i)\) is correlated with the demeaned error \((\varepsilon_{it} - \bar{\varepsilon}_i)\), because \(\bar{y}_i\) contains future values including \(y_{it}\). Result: FE underestimates \(\gamma\) (Nickell 1981 bias of approximately \(-(1+\gamma)/(T-1)\)).
Random Effects requires strict exogeneity, which is violated by construction when \(y_{i,t-1}\) appears as a regressor (it depends on past shocks). Result: RE is inconsistent.
The Estimator Ranking¶
For any true coefficient \(\gamma\):
This provides bounds for validating GMM estimates. A valid GMM estimate should fall between OLS and FE.
When You Need GMM¶
GMM is the appropriate estimator when you have:
- A lagged dependent variable as a regressor
- Individual fixed effects (\(\alpha_i\))
- Short panel (T < 20, often T = 5-10)
- Large N (many cross-sectional units)
If T is large (T > 30), Fixed Effects bias becomes negligible and FE may suffice.
Part 2: The GMM Solution¶
The Key Insight¶
GMM eliminates \(\alpha_i\) through first-differencing instead of demeaning:
This removes \(\alpha_i\) without creating the Nickell bias. However, \(\Delta y_{i,t-1}\) is still correlated with \(\Delta \varepsilon_{it}\) (because both contain \(\varepsilon_{i,t-1}\)), so we need instruments.
Moment Conditions¶
The central moment condition for Difference GMM:
Past values of \(y\) (at least 2 periods ago) are valid instruments because they are predetermined and uncorrelated with the differenced error.
Difference vs System GMM¶
Difference GMM (Arellano-Bond 1991):
- First-differences the equation
- Uses lagged levels as instruments
- Fewer assumptions, more robust
- Can have weak instruments when series are persistent
System GMM (Blundell-Bond 1998):
- Combines difference and level equations
- Uses lagged differences as instruments for the level equation
- More efficient when \(\gamma > 0.8\)
- Requires stationarity of initial conditions
| Criterion | Difference GMM | System GMM |
|---|---|---|
| Persistence (\(\gamma\)) | < 0.8 | > 0.8 |
| Stationarity required | No | Yes |
| Instrument strength | Weaker for persistent data | Stronger |
| Efficiency | Lower | Higher |
| Robustness | Higher | Lower (more assumptions) |
Part 3: Hands-On Example¶
Step 1: Load and Inspect Data¶
import pandas as pd
import numpy as np
from panelbox.gmm import DifferenceGMM, SystemGMM
from panelbox.datasets import load_abdata
# Load Arellano-Bond employment dataset
data = load_abdata()
# Inspect panel structure
print(f"Number of firms (N): {data['id'].nunique()}")
print(f"Number of years (T): {data['year'].nunique()}")
print(f"Total observations: {len(data)}")
# Check for missing values
print(f"\nMissing values:\n{data[['n', 'w', 'k']].isnull().sum()}")
# Descriptive statistics
print(f"\nDescriptive statistics:")
print(data[["n", "w", "k"]].describe())
Step 2: Establish OLS/FE Bounds¶
Before running GMM, establish the credible range for the AR coefficient:
# Create lagged dependent variable
df = data.sort_values(["id", "year"]).copy()
df["n_lag"] = df.groupby("id")["n"].shift(1)
df_clean = df.dropna(subset=["n_lag"])
# --- Pooled OLS (upper bound) ---
X_ols = np.column_stack([
np.ones(len(df_clean)),
df_clean[["n_lag", "w", "k"]].values,
])
y_ols = df_clean["n"].values
beta_ols = np.linalg.lstsq(X_ols, y_ols, rcond=None)[0]
gamma_ols = beta_ols[1]
# --- Fixed Effects (lower bound) ---
groups = df_clean["id"].values
y_dm = df_clean["n"].values.astype(float).copy()
X_dm = df_clean[["n_lag", "w", "k"]].values.astype(float).copy()
for g in np.unique(groups):
mask = groups == g
y_dm[mask] -= y_dm[mask].mean()
X_dm[mask] -= X_dm[mask].mean(axis=0)
beta_fe = np.linalg.lstsq(X_dm, y_dm, rcond=None)[0]
gamma_fe = beta_fe[0]
print(f"Credible range for persistence coefficient:")
print(f" FE (lower bound): {gamma_fe:.4f}")
print(f" OLS (upper bound): {gamma_ols:.4f}")
print(f" Range width: {gamma_ols - gamma_fe:.4f}")
if gamma_ols - gamma_fe > 0.1:
print("\nLarge gap -- GMM is recommended")
else:
print("\nSmall gap -- GMM may not be necessary")
Step 3: Estimate Difference GMM¶
gmm_diff = DifferenceGMM(
data=data,
dep_var="n",
lags=1,
id_var="id",
time_var="year",
exog_vars=["w", "k"],
time_dummies=True,
collapse=True,
two_step=True,
robust=True,
)
results_diff = gmm_diff.fit()
print(results_diff.summary())
Step 4: Interpret Coefficients¶
gamma_diff = results_diff.params["L1.n"]
se_diff = results_diff.std_errors["L1.n"]
print(f"\nDifference GMM Results:")
print(f" Persistence (L1.n): {gamma_diff:.4f} (SE: {se_diff:.4f})")
print(f" 95% CI: [{gamma_diff - 1.96*se_diff:.4f}, {gamma_diff + 1.96*se_diff:.4f}]")
# Credibility check
in_range = gamma_fe < gamma_diff < gamma_ols
print(f"\n Within [{gamma_fe:.4f}, {gamma_ols:.4f}]? {in_range}")
if not in_range:
print(" WARNING: Estimate outside credible bounds")
Step 5: Run Diagnostic Tests¶
print("\n" + "=" * 70)
print("DIAGNOSTIC TESTS")
print("=" * 70)
# 1. AR(2) -- MOST CRITICAL
ar2 = results_diff.ar2_test
print(f"\n1. AR(2) test (H0: no 2nd-order autocorrelation)")
print(f" z = {ar2.statistic:.3f}, p = {ar2.pvalue:.4f}")
if ar2.pvalue > 0.10:
print(" PASS: Moment conditions valid")
else:
print(" FAIL: Moment conditions rejected -- GMM invalid!")
# 2. Hansen J -- instrument validity
hansen = results_diff.hansen_j
print(f"\n2. Hansen J test (H0: instruments valid)")
print(f" stat = {hansen.statistic:.3f}, p = {hansen.pvalue:.4f}, df = {hansen.df}")
if 0.10 < hansen.pvalue < 0.25:
print(" PASS: Instruments appear valid (ideal range)")
elif hansen.pvalue < 0.10:
print(" FAIL: Instruments rejected")
elif hansen.pvalue > 0.50:
print(" WARNING: p-value very high -- possible weak instruments")
else:
print(" PASS: Instruments acceptable")
# 3. AR(1) -- informational
ar1 = results_diff.ar1_test
print(f"\n3. AR(1) test (expected to reject)")
print(f" z = {ar1.statistic:.3f}, p = {ar1.pvalue:.4f}")
if ar1.pvalue < 0.10:
print(" EXPECTED: MA(1) structure from first-differencing")
# 4. Instrument count
print(f"\n4. Instrument diagnostics")
print(f" Observations: {results_diff.nobs}")
print(f" Groups: {results_diff.n_groups}")
print(f" Instruments: {results_diff.n_instruments}")
print(f" Ratio: {results_diff.instrument_ratio:.3f}")
if results_diff.instrument_ratio < 1.0:
print(" PASS: Instrument count appropriate")
else:
print(" WARNING: Too many instruments")
Step 6: Estimate System GMM¶
gmm_sys = SystemGMM(
data=data,
dep_var="n",
lags=1,
id_var="id",
time_var="year",
exog_vars=["w", "k"],
time_dummies=True,
collapse=True,
two_step=True,
robust=True,
level_instruments={"max_lags": 1},
)
results_sys = gmm_sys.fit()
print(results_sys.summary())
gamma_sys = results_sys.params["L1.n"]
se_sys = results_sys.std_errors["L1.n"]
# Compare efficiency
efficiency_gain = (se_diff - se_sys) / se_diff * 100
print(f"\nComparison of Estimates:")
print(f" OLS: {gamma_ols:.4f} (upper bound)")
print(f" System GMM: {gamma_sys:.4f} (SE: {se_sys:.4f})")
print(f" Difference GMM: {gamma_diff:.4f} (SE: {se_diff:.4f})")
print(f" FE: {gamma_fe:.4f} (lower bound)")
print(f" Efficiency gain: {efficiency_gain:.1f}% SE reduction")
Step 7: Choose the Best Model¶
print("\n" + "=" * 70)
print("MODEL SELECTION")
print("=" * 70)
# Validity criteria
diff_valid = (
results_diff.ar2_test.pvalue > 0.10
and results_diff.hansen_j.pvalue > 0.10
and results_diff.instrument_ratio < 1.0
)
sys_valid = (
results_sys.ar2_test.pvalue > 0.10
and results_sys.hansen_j.pvalue > 0.10
and results_sys.instrument_ratio < 1.0
)
print(f"\nDifference GMM valid: {diff_valid}")
print(f"System GMM valid: {sys_valid}")
if diff_valid and sys_valid:
if se_sys < se_diff * 0.9:
print("\nRECOMMENDATION: System GMM (more efficient and valid)")
final_results = results_sys
else:
print("\nRECOMMENDATION: Difference GMM (more robust, similar efficiency)")
final_results = results_diff
elif diff_valid:
print("\nRECOMMENDATION: Difference GMM (System GMM fails diagnostics)")
final_results = results_diff
elif sys_valid:
print("\nRECOMMENDATION: System GMM (Difference GMM fails diagnostics)")
final_results = results_sys
else:
print("\nWARNING: Both models fail diagnostics -- revise specification")
final_results = None
Step 8: Report Results¶
if final_results is not None:
# Coefficient table
coef_table = pd.DataFrame({
"Coefficient": final_results.params,
"Std. Error": final_results.std_errors,
"z-stat": final_results.tvalues,
"p-value": final_results.pvalues,
})
print("\nFinal Coefficient Table:")
print(coef_table.to_string())
# LaTeX export for papers
latex = final_results.to_latex(
caption="Dynamic Panel GMM Estimation",
label="tab:gmm_results",
)
print("\nLaTeX output:")
print(latex)
Part 4: Diagnostic Checklist¶
Use this checklist to validate every GMM estimation:
Essential Tests (Must Pass)¶
| Test | Criterion | Why It Matters |
|---|---|---|
| AR(2) p > 0.10 | No serial correlation | Validates moment conditions |
| Hansen J: 0.10 < p < 0.25 | Instruments valid | Confirms instrument exogeneity |
| Instrument ratio < 1.0 | No overfitting | Ensures test power |
| Coefficient in [FE, OLS] | Plausible estimate | Rules out gross misspecification |
Warning Signs¶
| Symptom | Likely Problem | Solution |
|---|---|---|
| AR(2) p < 0.05 | Serial correlation in levels | Add lags: lags=[1, 2] |
| Hansen J p < 0.05 | Invalid instruments | Reclassify variables, reduce instruments |
| Hansen J p > 0.50 | Weak instruments or overfitting | Check ratio, try System GMM |
| Ratio > 1.0 | Instrument proliferation | collapse=True, limit gmm_max_lag |
| Very large SEs | Weak instruments | Try System GMM |
| Coefficient outside bounds | Overfitting or misspecification | Simplify model |
| Few observations retained | Specification too complex | time_dummies=False |
Part 5: Advanced Topics¶
Predetermined vs Endogenous Variables¶
Not all regressors are strictly exogenous. GMM handles three types:
| Type | Assumption | Instrument Lags | Example |
|---|---|---|---|
| Strictly exogenous | \(E[x_{it}\varepsilon_{is}] = 0\) for all \(s, t\) | All (IV-style) | Policy shocks, weather |
| Predetermined | \(E[x_{it}\varepsilon_{is}] = 0\) for \(s \geq t\) | \(t-2\) and earlier | Lagged inputs |
| Endogenous | \(E[x_{it}\varepsilon_{it}] \neq 0\) | \(t-3\) and earlier | Contemporaneous inputs |
model = DifferenceGMM(
data=data,
dep_var="y",
lags=1,
id_var="id",
time_var="year",
exog_vars=["policy"], # Strictly exogenous
predetermined_vars=["capital"], # Predetermined
endogenous_vars=["labor"], # Endogenous
collapse=True,
two_step=True,
)
Handling Unbalanced Panels¶
Unbalanced panels require extra care:
- Always use
collapse=True-- Reduces instruments and handles sparsity - Consider
time_dummies=False-- Many dummies with unbalanced panels can cause identification issues - Keep specifications parsimonious -- Fewer parameters means better identification
- Check observation retention -- Large drops signal problems
model = DifferenceGMM(
data=unbalanced_data,
dep_var="y",
lags=1,
id_var="id",
time_var="year",
exog_vars=["x1", "x2"],
time_dummies=False, # Avoid issues with unbalanced panels
collapse=True, # Essential for unbalanced panels
two_step=True,
)
results = model.fit()
retention = results.nobs / len(unbalanced_data) * 100
print(f"Observation retention: {retention:.1f}%")
Limiting Instrument Depth¶
Control which lags are used as instruments:
# Use only lags 2 and 3 (instead of all available)
model = DifferenceGMM(
data=data,
dep_var="y",
lags=1,
exog_vars=["x1"],
collapse=True,
gmm_max_lag=3, # Only y_{t-2} and y_{t-3} as instruments
two_step=True,
)
Weak Instruments¶
Symptoms: Very large standard errors, Hansen J p-value > 0.50, implausible coefficients.
Solutions:
- Try System GMM -- additional instruments for persistent series
- Reduce instrument count -- focus on most relevant lags
- Increase sample size -- more groups or time periods
- Check if GMM is truly needed -- compare OLS/FE gap
Small Sample Corrections¶
For small N or T:
- Windmeijer correction -- automatically applied with
two_step=True, robust=True - One-step GMM -- less efficient but more robust:
gmm_type="one_step" - Conservative instruments -- always use
collapse=True - Bias correction -- consider BiasCorrectedGMM for moderate N and T
Overfitting Diagnostics¶
from panelbox.gmm import GMMOverfitDiagnostic
diag = GMMOverfitDiagnostic(model, results)
# Full diagnostic report with traffic-light signals
print(diag.summary())
# Individual checks
feas = diag.assess_feasibility()
bounds = diag.coefficient_bounds_test()
step = diag.step_comparison()
See Instruments for a detailed guide to the GMMOverfitDiagnostic class.
Summary: Your GMM Workflow¶
- Check if you need GMM: Lagged dependent variable + fixed effects + small T
- Establish OLS/FE bounds: Credible range for the AR coefficient
- Start with Difference GMM:
collapse=True,two_step=True,robust=True - Run full diagnostics: AR(2), Hansen J, instrument ratio, bounds check
- Try System GMM if series is persistent or Difference GMM has large SEs
- Choose the best model: Valid diagnostics + smaller standard errors
- Report results: Include all diagnostic tests in your paper
Tutorials¶
| Tutorial | Description | Link |
|---|---|---|
| Difference GMM | Arellano-Bond estimator details | Difference GMM |
| System GMM | Blundell-Bond estimator details | System GMM |
| CUE-GMM | Continuous Updating Estimator | CUE-GMM |
| Bias-Corrected GMM | Analytical bias correction | Bias-Corrected |
| Instruments | Instrument selection and management | Instruments |
| Diagnostics | Complete diagnostic test guide | Diagnostics |
See Also¶
- Difference GMM -- Detailed Arellano-Bond reference
- System GMM -- Detailed Blundell-Bond reference
- Instruments -- Instrument selection and proliferation
- Diagnostics -- All diagnostic tests explained
References¶
- Arellano, M., & Bond, S. (1991). "Some Tests of Specification for Panel Data: Monte Carlo Evidence and an Application to Employment Equations." Review of Economic Studies, 58(2), 277-297.
- Blundell, R., & Bond, S. (1998). "Initial Conditions and Moment Restrictions in Dynamic Panel Data Models." Journal of Econometrics, 87(1), 115-143.
- Windmeijer, F. (2005). "A Finite Sample Correction for the Variance of Linear Efficient Two-Step GMM Estimators." Journal of Econometrics, 126(1), 25-51.
- Roodman, D. (2009). "How to do xtabond2: An Introduction to Difference and System GMM in Stata." The Stata Journal, 9(1), 86-136.
- Nickell, S. (1981). "Biases in Dynamic Models with Fixed Effects." Econometrica, 49(6), 1417-1426.
- Bond, S. R. (2002). "Dynamic Panel Data Models: A Guide to Micro Data Methods and Practice." Portuguese Economic Journal, 1(2), 141-162.
- Baltagi, B. H. (2021). Econometric Analysis of Panel Data (6th ed.). Springer.
- Wooldridge, J. M. (2010). Econometric Analysis of Cross Section and Panel Data (2nd ed.). MIT Press.