Skip to content

Complete GMM Guide

Quick Reference

Module: panelbox.gmm Key classes: DifferenceGMM, SystemGMM Stata equivalent: xtabond2 R equivalent: plm::pgmm()

What You Will Learn

This step-by-step guide walks through the entire GMM workflow:

  1. Understanding when and why you need GMM
  2. Preparing data and checking panel structure
  3. Estimating Difference and System GMM
  4. Running and interpreting all diagnostic tests
  5. Choosing between models
  6. Reporting results for publication

Prerequisites: Basic familiarity with panel data and regression analysis.

Part 1: Understanding the Problem

The Dynamic Panel Challenge

Consider a typical dynamic panel data model where the outcome depends on its own past value:

\[y_{it} = \gamma \, y_{i,t-1} + X_{it}'\beta + \alpha_i + \varepsilon_{it}\]

where:

  • \(y_{it}\): Dependent variable (e.g., log employment for firm \(i\) at time \(t\))
  • \(y_{i,t-1}\): Lagged dependent variable (captures persistence)
  • \(X_{it}\): Exogenous regressors (e.g., wages, capital stock)
  • \(\alpha_i\): Unobserved individual fixed effect
  • \(\varepsilon_{it}\): Idiosyncratic error term

The fundamental problem: Standard estimators all fail in this setup.

Why Standard Estimators Fail

Pooled OLS ignores \(\alpha_i\), creating omitted variable bias. Since \(\alpha_i\) affects all periods, \(y_{i,t-1}\) is correlated with \(\alpha_i\). Result: OLS overestimates \(\gamma\) (biased upward).

Fixed Effects (Within) removes \(\alpha_i\) by demeaning, but the demeaned lagged variable \((y_{i,t-1} - \bar{y}_i)\) is correlated with the demeaned error \((\varepsilon_{it} - \bar{\varepsilon}_i)\), because \(\bar{y}_i\) contains future values including \(y_{it}\). Result: FE underestimates \(\gamma\) (Nickell 1981 bias of approximately \(-(1+\gamma)/(T-1)\)).

Random Effects requires strict exogeneity, which is violated by construction when \(y_{i,t-1}\) appears as a regressor (it depends on past shocks). Result: RE is inconsistent.

The Estimator Ranking

For any true coefficient \(\gamma\):

\[\hat{\gamma}_{OLS} > \gamma_{true} > \hat{\gamma}_{FE}\]

This provides bounds for validating GMM estimates. A valid GMM estimate should fall between OLS and FE.

When You Need GMM

GMM is the appropriate estimator when you have:

  • A lagged dependent variable as a regressor
  • Individual fixed effects (\(\alpha_i\))
  • Short panel (T < 20, often T = 5-10)
  • Large N (many cross-sectional units)

If T is large (T > 30), Fixed Effects bias becomes negligible and FE may suffice.


Part 2: The GMM Solution

The Key Insight

GMM eliminates \(\alpha_i\) through first-differencing instead of demeaning:

\[\Delta y_{it} = \gamma \, \Delta y_{i,t-1} + \Delta X_{it}'\beta + \Delta \varepsilon_{it}\]

This removes \(\alpha_i\) without creating the Nickell bias. However, \(\Delta y_{i,t-1}\) is still correlated with \(\Delta \varepsilon_{it}\) (because both contain \(\varepsilon_{i,t-1}\)), so we need instruments.

Moment Conditions

The central moment condition for Difference GMM:

\[E[y_{i,t-s} \cdot \Delta \varepsilon_{it}] = 0 \quad \text{for } s \geq 2\]

Past values of \(y\) (at least 2 periods ago) are valid instruments because they are predetermined and uncorrelated with the differenced error.

Difference vs System GMM

Difference GMM (Arellano-Bond 1991):

  • First-differences the equation
  • Uses lagged levels as instruments
  • Fewer assumptions, more robust
  • Can have weak instruments when series are persistent

System GMM (Blundell-Bond 1998):

  • Combines difference and level equations
  • Uses lagged differences as instruments for the level equation
  • More efficient when \(\gamma > 0.8\)
  • Requires stationarity of initial conditions
Criterion Difference GMM System GMM
Persistence (\(\gamma\)) < 0.8 > 0.8
Stationarity required No Yes
Instrument strength Weaker for persistent data Stronger
Efficiency Lower Higher
Robustness Higher Lower (more assumptions)

Part 3: Hands-On Example

Step 1: Load and Inspect Data

import pandas as pd
import numpy as np
from panelbox.gmm import DifferenceGMM, SystemGMM
from panelbox.datasets import load_abdata

# Load Arellano-Bond employment dataset
data = load_abdata()

# Inspect panel structure
print(f"Number of firms (N): {data['id'].nunique()}")
print(f"Number of years (T): {data['year'].nunique()}")
print(f"Total observations: {len(data)}")

# Check for missing values
print(f"\nMissing values:\n{data[['n', 'w', 'k']].isnull().sum()}")

# Descriptive statistics
print(f"\nDescriptive statistics:")
print(data[["n", "w", "k"]].describe())

Step 2: Establish OLS/FE Bounds

Before running GMM, establish the credible range for the AR coefficient:

# Create lagged dependent variable
df = data.sort_values(["id", "year"]).copy()
df["n_lag"] = df.groupby("id")["n"].shift(1)
df_clean = df.dropna(subset=["n_lag"])

# --- Pooled OLS (upper bound) ---
X_ols = np.column_stack([
    np.ones(len(df_clean)),
    df_clean[["n_lag", "w", "k"]].values,
])
y_ols = df_clean["n"].values
beta_ols = np.linalg.lstsq(X_ols, y_ols, rcond=None)[0]
gamma_ols = beta_ols[1]

# --- Fixed Effects (lower bound) ---
groups = df_clean["id"].values
y_dm = df_clean["n"].values.astype(float).copy()
X_dm = df_clean[["n_lag", "w", "k"]].values.astype(float).copy()

for g in np.unique(groups):
    mask = groups == g
    y_dm[mask] -= y_dm[mask].mean()
    X_dm[mask] -= X_dm[mask].mean(axis=0)

beta_fe = np.linalg.lstsq(X_dm, y_dm, rcond=None)[0]
gamma_fe = beta_fe[0]

print(f"Credible range for persistence coefficient:")
print(f"  FE  (lower bound): {gamma_fe:.4f}")
print(f"  OLS (upper bound): {gamma_ols:.4f}")
print(f"  Range width: {gamma_ols - gamma_fe:.4f}")

if gamma_ols - gamma_fe > 0.1:
    print("\nLarge gap -- GMM is recommended")
else:
    print("\nSmall gap -- GMM may not be necessary")

Step 3: Estimate Difference GMM

gmm_diff = DifferenceGMM(
    data=data,
    dep_var="n",
    lags=1,
    id_var="id",
    time_var="year",
    exog_vars=["w", "k"],
    time_dummies=True,
    collapse=True,
    two_step=True,
    robust=True,
)

results_diff = gmm_diff.fit()
print(results_diff.summary())

Step 4: Interpret Coefficients

gamma_diff = results_diff.params["L1.n"]
se_diff = results_diff.std_errors["L1.n"]

print(f"\nDifference GMM Results:")
print(f"  Persistence (L1.n): {gamma_diff:.4f} (SE: {se_diff:.4f})")
print(f"  95% CI: [{gamma_diff - 1.96*se_diff:.4f}, {gamma_diff + 1.96*se_diff:.4f}]")

# Credibility check
in_range = gamma_fe < gamma_diff < gamma_ols
print(f"\n  Within [{gamma_fe:.4f}, {gamma_ols:.4f}]? {in_range}")

if not in_range:
    print("  WARNING: Estimate outside credible bounds")

Step 5: Run Diagnostic Tests

print("\n" + "=" * 70)
print("DIAGNOSTIC TESTS")
print("=" * 70)

# 1. AR(2) -- MOST CRITICAL
ar2 = results_diff.ar2_test
print(f"\n1. AR(2) test (H0: no 2nd-order autocorrelation)")
print(f"   z = {ar2.statistic:.3f}, p = {ar2.pvalue:.4f}")
if ar2.pvalue > 0.10:
    print("   PASS: Moment conditions valid")
else:
    print("   FAIL: Moment conditions rejected -- GMM invalid!")

# 2. Hansen J -- instrument validity
hansen = results_diff.hansen_j
print(f"\n2. Hansen J test (H0: instruments valid)")
print(f"   stat = {hansen.statistic:.3f}, p = {hansen.pvalue:.4f}, df = {hansen.df}")
if 0.10 < hansen.pvalue < 0.25:
    print("   PASS: Instruments appear valid (ideal range)")
elif hansen.pvalue < 0.10:
    print("   FAIL: Instruments rejected")
elif hansen.pvalue > 0.50:
    print("   WARNING: p-value very high -- possible weak instruments")
else:
    print("   PASS: Instruments acceptable")

# 3. AR(1) -- informational
ar1 = results_diff.ar1_test
print(f"\n3. AR(1) test (expected to reject)")
print(f"   z = {ar1.statistic:.3f}, p = {ar1.pvalue:.4f}")
if ar1.pvalue < 0.10:
    print("   EXPECTED: MA(1) structure from first-differencing")

# 4. Instrument count
print(f"\n4. Instrument diagnostics")
print(f"   Observations: {results_diff.nobs}")
print(f"   Groups: {results_diff.n_groups}")
print(f"   Instruments: {results_diff.n_instruments}")
print(f"   Ratio: {results_diff.instrument_ratio:.3f}")
if results_diff.instrument_ratio < 1.0:
    print("   PASS: Instrument count appropriate")
else:
    print("   WARNING: Too many instruments")

Step 6: Estimate System GMM

gmm_sys = SystemGMM(
    data=data,
    dep_var="n",
    lags=1,
    id_var="id",
    time_var="year",
    exog_vars=["w", "k"],
    time_dummies=True,
    collapse=True,
    two_step=True,
    robust=True,
    level_instruments={"max_lags": 1},
)

results_sys = gmm_sys.fit()
print(results_sys.summary())

gamma_sys = results_sys.params["L1.n"]
se_sys = results_sys.std_errors["L1.n"]

# Compare efficiency
efficiency_gain = (se_diff - se_sys) / se_diff * 100

print(f"\nComparison of Estimates:")
print(f"  OLS:            {gamma_ols:.4f} (upper bound)")
print(f"  System GMM:     {gamma_sys:.4f} (SE: {se_sys:.4f})")
print(f"  Difference GMM: {gamma_diff:.4f} (SE: {se_diff:.4f})")
print(f"  FE:             {gamma_fe:.4f} (lower bound)")
print(f"  Efficiency gain: {efficiency_gain:.1f}% SE reduction")

Step 7: Choose the Best Model

print("\n" + "=" * 70)
print("MODEL SELECTION")
print("=" * 70)

# Validity criteria
diff_valid = (
    results_diff.ar2_test.pvalue > 0.10
    and results_diff.hansen_j.pvalue > 0.10
    and results_diff.instrument_ratio < 1.0
)

sys_valid = (
    results_sys.ar2_test.pvalue > 0.10
    and results_sys.hansen_j.pvalue > 0.10
    and results_sys.instrument_ratio < 1.0
)

print(f"\nDifference GMM valid: {diff_valid}")
print(f"System GMM valid:     {sys_valid}")

if diff_valid and sys_valid:
    if se_sys < se_diff * 0.9:
        print("\nRECOMMENDATION: System GMM (more efficient and valid)")
        final_results = results_sys
    else:
        print("\nRECOMMENDATION: Difference GMM (more robust, similar efficiency)")
        final_results = results_diff
elif diff_valid:
    print("\nRECOMMENDATION: Difference GMM (System GMM fails diagnostics)")
    final_results = results_diff
elif sys_valid:
    print("\nRECOMMENDATION: System GMM (Difference GMM fails diagnostics)")
    final_results = results_sys
else:
    print("\nWARNING: Both models fail diagnostics -- revise specification")
    final_results = None

Step 8: Report Results

if final_results is not None:
    # Coefficient table
    coef_table = pd.DataFrame({
        "Coefficient": final_results.params,
        "Std. Error": final_results.std_errors,
        "z-stat": final_results.tvalues,
        "p-value": final_results.pvalues,
    })
    print("\nFinal Coefficient Table:")
    print(coef_table.to_string())

    # LaTeX export for papers
    latex = final_results.to_latex(
        caption="Dynamic Panel GMM Estimation",
        label="tab:gmm_results",
    )
    print("\nLaTeX output:")
    print(latex)

Part 4: Diagnostic Checklist

Use this checklist to validate every GMM estimation:

Essential Tests (Must Pass)

Test Criterion Why It Matters
AR(2) p > 0.10 No serial correlation Validates moment conditions
Hansen J: 0.10 < p < 0.25 Instruments valid Confirms instrument exogeneity
Instrument ratio < 1.0 No overfitting Ensures test power
Coefficient in [FE, OLS] Plausible estimate Rules out gross misspecification

Warning Signs

Symptom Likely Problem Solution
AR(2) p < 0.05 Serial correlation in levels Add lags: lags=[1, 2]
Hansen J p < 0.05 Invalid instruments Reclassify variables, reduce instruments
Hansen J p > 0.50 Weak instruments or overfitting Check ratio, try System GMM
Ratio > 1.0 Instrument proliferation collapse=True, limit gmm_max_lag
Very large SEs Weak instruments Try System GMM
Coefficient outside bounds Overfitting or misspecification Simplify model
Few observations retained Specification too complex time_dummies=False

Part 5: Advanced Topics

Predetermined vs Endogenous Variables

Not all regressors are strictly exogenous. GMM handles three types:

Type Assumption Instrument Lags Example
Strictly exogenous \(E[x_{it}\varepsilon_{is}] = 0\) for all \(s, t\) All (IV-style) Policy shocks, weather
Predetermined \(E[x_{it}\varepsilon_{is}] = 0\) for \(s \geq t\) \(t-2\) and earlier Lagged inputs
Endogenous \(E[x_{it}\varepsilon_{it}] \neq 0\) \(t-3\) and earlier Contemporaneous inputs
model = DifferenceGMM(
    data=data,
    dep_var="y",
    lags=1,
    id_var="id",
    time_var="year",
    exog_vars=["policy"],           # Strictly exogenous
    predetermined_vars=["capital"],  # Predetermined
    endogenous_vars=["labor"],       # Endogenous
    collapse=True,
    two_step=True,
)

Handling Unbalanced Panels

Unbalanced panels require extra care:

  1. Always use collapse=True -- Reduces instruments and handles sparsity
  2. Consider time_dummies=False -- Many dummies with unbalanced panels can cause identification issues
  3. Keep specifications parsimonious -- Fewer parameters means better identification
  4. Check observation retention -- Large drops signal problems
model = DifferenceGMM(
    data=unbalanced_data,
    dep_var="y",
    lags=1,
    id_var="id",
    time_var="year",
    exog_vars=["x1", "x2"],
    time_dummies=False,  # Avoid issues with unbalanced panels
    collapse=True,       # Essential for unbalanced panels
    two_step=True,
)
results = model.fit()
retention = results.nobs / len(unbalanced_data) * 100
print(f"Observation retention: {retention:.1f}%")

Limiting Instrument Depth

Control which lags are used as instruments:

# Use only lags 2 and 3 (instead of all available)
model = DifferenceGMM(
    data=data,
    dep_var="y",
    lags=1,
    exog_vars=["x1"],
    collapse=True,
    gmm_max_lag=3,   # Only y_{t-2} and y_{t-3} as instruments
    two_step=True,
)

Weak Instruments

Symptoms: Very large standard errors, Hansen J p-value > 0.50, implausible coefficients.

Solutions:

  1. Try System GMM -- additional instruments for persistent series
  2. Reduce instrument count -- focus on most relevant lags
  3. Increase sample size -- more groups or time periods
  4. Check if GMM is truly needed -- compare OLS/FE gap

Small Sample Corrections

For small N or T:

  1. Windmeijer correction -- automatically applied with two_step=True, robust=True
  2. One-step GMM -- less efficient but more robust: gmm_type="one_step"
  3. Conservative instruments -- always use collapse=True
  4. Bias correction -- consider BiasCorrectedGMM for moderate N and T

Overfitting Diagnostics

from panelbox.gmm import GMMOverfitDiagnostic

diag = GMMOverfitDiagnostic(model, results)

# Full diagnostic report with traffic-light signals
print(diag.summary())

# Individual checks
feas = diag.assess_feasibility()
bounds = diag.coefficient_bounds_test()
step = diag.step_comparison()

See Instruments for a detailed guide to the GMMOverfitDiagnostic class.


Summary: Your GMM Workflow

  1. Check if you need GMM: Lagged dependent variable + fixed effects + small T
  2. Establish OLS/FE bounds: Credible range for the AR coefficient
  3. Start with Difference GMM: collapse=True, two_step=True, robust=True
  4. Run full diagnostics: AR(2), Hansen J, instrument ratio, bounds check
  5. Try System GMM if series is persistent or Difference GMM has large SEs
  6. Choose the best model: Valid diagnostics + smaller standard errors
  7. Report results: Include all diagnostic tests in your paper

Tutorials

Tutorial Description Link
Difference GMM Arellano-Bond estimator details Difference GMM
System GMM Blundell-Bond estimator details System GMM
CUE-GMM Continuous Updating Estimator CUE-GMM
Bias-Corrected GMM Analytical bias correction Bias-Corrected
Instruments Instrument selection and management Instruments
Diagnostics Complete diagnostic test guide Diagnostics

See Also

References

  1. Arellano, M., & Bond, S. (1991). "Some Tests of Specification for Panel Data: Monte Carlo Evidence and an Application to Employment Equations." Review of Economic Studies, 58(2), 277-297.
  2. Blundell, R., & Bond, S. (1998). "Initial Conditions and Moment Restrictions in Dynamic Panel Data Models." Journal of Econometrics, 87(1), 115-143.
  3. Windmeijer, F. (2005). "A Finite Sample Correction for the Variance of Linear Efficient Two-Step GMM Estimators." Journal of Econometrics, 126(1), 25-51.
  4. Roodman, D. (2009). "How to do xtabond2: An Introduction to Difference and System GMM in Stata." The Stata Journal, 9(1), 86-136.
  5. Nickell, S. (1981). "Biases in Dynamic Models with Fixed Effects." Econometrica, 49(6), 1417-1426.
  6. Bond, S. R. (2002). "Dynamic Panel Data Models: A Guide to Micro Data Methods and Practice." Portuguese Economic Journal, 1(2), 141-162.
  7. Baltagi, B. H. (2021). Econometric Analysis of Panel Data (6th ed.). Springer.
  8. Wooldridge, J. M. (2010). Econometric Analysis of Cross Section and Panel Data (2nd ed.). MIT Press.