Complete GMM Guide¶

Quick Reference

Module: panelbox.gmm Key classes: DifferenceGMM, SystemGMM Stata equivalent: xtabond2 R equivalent: plm::pgmm()

What You Will Learn

This step-by-step guide walks through the entire GMM workflow:

Understanding when and why you need GMM
Preparing data and checking panel structure
Estimating Difference and System GMM
Running and interpreting all diagnostic tests
Choosing between models
Reporting results for publication

Prerequisites: Basic familiarity with panel data and regression analysis.

Part 1: Understanding the Problem¶

The Dynamic Panel Challenge¶

Consider a typical dynamic panel data model where the outcome depends on its own past value:

\[y_{it} = \gamma \, y_{i,t-1} + X_{it}'\beta + \alpha_i + \varepsilon_{it}\]

where:

\(y_{it}\): Dependent variable (e.g., log employment for firm \(i\) at time \(t\))
\(y_{i,t-1}\): Lagged dependent variable (captures persistence)
\(X_{it}\): Exogenous regressors (e.g., wages, capital stock)
\(\alpha_i\): Unobserved individual fixed effect
\(\varepsilon_{it}\): Idiosyncratic error term

The fundamental problem: Standard estimators all fail in this setup.

Why Standard Estimators Fail¶

Pooled OLS ignores \(\alpha_i\), creating omitted variable bias. Since \(\alpha_i\) affects all periods, \(y_{i,t-1}\) is correlated with \(\alpha_i\). Result: OLS overestimates \(\gamma\) (biased upward).

Fixed Effects (Within) removes \(\alpha_i\) by demeaning, but the demeaned lagged variable \((y_{i,t-1} - \bar{y}_i)\) is correlated with the demeaned error \((\varepsilon_{it} - \bar{\varepsilon}_i)\), because \(\bar{y}_i\) contains future values including \(y_{it}\). Result: FE underestimates \(\gamma\) (Nickell 1981 bias of approximately \(-(1+\gamma)/(T-1)\)).

Random Effects requires strict exogeneity, which is violated by construction when \(y_{i,t-1}\) appears as a regressor (it depends on past shocks). Result: RE is inconsistent.

The Estimator Ranking¶

For any true coefficient \(\gamma\):

\[\hat{\gamma}_{OLS} > \gamma_{true} > \hat{\gamma}_{FE}\]

This provides bounds for validating GMM estimates. A valid GMM estimate should fall between OLS and FE.

When You Need GMM¶

GMM is the appropriate estimator when you have:

A lagged dependent variable as a regressor
Individual fixed effects (\(\alpha_i\))
Short panel (T < 20, often T = 5-10)
Large N (many cross-sectional units)

If T is large (T > 30), Fixed Effects bias becomes negligible and FE may suffice.

Part 2: The GMM Solution¶

The Key Insight¶

GMM eliminates \(\alpha_i\) through first-differencing instead of demeaning:

\[\Delta y_{it} = \gamma \, \Delta y_{i,t-1} + \Delta X_{it}'\beta + \Delta \varepsilon_{it}\]

This removes \(\alpha_i\) without creating the Nickell bias. However, \(\Delta y_{i,t-1}\) is still correlated with \(\Delta \varepsilon_{it}\) (because both contain \(\varepsilon_{i,t-1}\)), so we need instruments.

Moment Conditions¶

The central moment condition for Difference GMM:

\[E[y_{i,t-s} \cdot \Delta \varepsilon_{it}] = 0 \quad \text{for } s \geq 2\]

Past values of \(y\) (at least 2 periods ago) are valid instruments because they are predetermined and uncorrelated with the differenced error.

Difference vs System GMM¶

Difference GMM (Arellano-Bond 1991):

First-differences the equation
Uses lagged levels as instruments
Fewer assumptions, more robust
Can have weak instruments when series are persistent

System GMM (Blundell-Bond 1998):

Combines difference and level equations
Uses lagged differences as instruments for the level equation
More efficient when \(\gamma > 0.8\)
Requires stationarity of initial conditions

Criterion	Difference GMM	System GMM
Persistence (\(\gamma\))	< 0.8	> 0.8
Stationarity required	No	Yes
Instrument strength	Weaker for persistent data	Stronger
Efficiency	Lower	Higher
Robustness	Higher	Lower (more assumptions)

Part 3: Hands-On Example¶

Step 1: Load and Inspect Data¶

import pandas as pd
import numpy as np
from panelbox.gmm import DifferenceGMM, SystemGMM
from panelbox.datasets import load_abdata

# Load Arellano-Bond employment dataset
data = load_abdata()

# Inspect panel structure
print(f"Number of firms (N): {data['id'].nunique()}")
print(f"Number of years (T): {data['year'].nunique()}")
print(f"Total observations: {len(data)}")

# Check for missing values
print(f"\nMissing values:\n{data[['n', 'w', 'k']].isnull().sum()}")

# Descriptive statistics
print(f"\nDescriptive statistics:")
print(data[["n", "w", "k"]].describe())

Step 2: Establish OLS/FE Bounds¶

Before running GMM, establish the credible range for the AR coefficient:

# Create lagged dependent variable
df = data.sort_values(["id", "year"]).copy()
df["n_lag"] = df.groupby("id")["n"].shift(1)
df_clean = df.dropna(subset=["n_lag"])

# --- Pooled OLS (upper bound) ---
X_ols = np.column_stack([
    np.ones(len(df_clean)),
    df_clean[["n_lag", "w", "k"]].values,
])
y_ols = df_clean["n"].values
beta_ols = np.linalg.lstsq(X_ols, y_ols, rcond=None)[0]
gamma_ols = beta_ols[1]

# --- Fixed Effects (lower bound) ---
groups = df_clean["id"].values
y_dm = df_clean["n"].values.astype(float).copy()
X_dm = df_clean[["n_lag", "w", "k"]].values.astype(float).copy()

for g in np.unique(groups):
    mask = groups == g
    y_dm[mask] -= y_dm[mask].mean()
    X_dm[mask] -= X_dm[mask].mean(axis=0)

beta_fe = np.linalg.lstsq(X_dm, y_dm, rcond=None)[0]
gamma_fe = beta_fe[0]

print(f"Credible range for persistence coefficient:")
print(f"  FE  (lower bound): {gamma_fe:.4f}")
print(f"  OLS (upper bound): {gamma_ols:.4f}")
print(f"  Range width: {gamma_ols - gamma_fe:.4f}")

if gamma_ols - gamma_fe > 0.1:
    print("\nLarge gap -- GMM is recommended")
else:
    print("\nSmall gap -- GMM may not be necessary")

Step 3: Estimate Difference GMM¶

gmm_diff = DifferenceGMM(
    data=data,
    dep_var="n",
    lags=1,
    id_var="id",
    time_var="year",
    exog_vars=["w", "k"],
    time_dummies=True,
    collapse=True,
    two_step=True,
    robust=True,
)

results_diff = gmm_diff.fit()
print(results_diff.summary())

Step 4: Interpret Coefficients¶

gamma_diff = results_diff.params["L1.n"]
se_diff = results_diff.std_errors["L1.n"]

print(f"\nDifference GMM Results:")
print(f"  Persistence (L1.n): {gamma_diff:.4f} (SE: {se_diff:.4f})")
print(f"  95% CI: [{gamma_diff - 1.96*se_diff:.4f}, {gamma_diff + 1.96*se_diff:.4f}]")

# Credibility check
in_range = gamma_fe < gamma_diff < gamma_ols
print(f"\n  Within [{gamma_fe:.4f}, {gamma_ols:.4f}]? {in_range}")

if not in_range:
    print("  WARNING: Estimate outside credible bounds")

Step 5: Run Diagnostic Tests¶

print("\n" + "=" * 70)
print("DIAGNOSTIC TESTS")
print("=" * 70)

# 1. AR(2) -- MOST CRITICAL
ar2 = results_diff.ar2_test
print(f"\n1. AR(2) test (H0: no 2nd-order autocorrelation)")
print(f"   z = {ar2.statistic:.3f}, p = {ar2.pvalue:.4f}")
if ar2.pvalue > 0.10:
    print("   PASS: Moment conditions valid")
else:
    print("   FAIL: Moment conditions rejected -- GMM invalid!")

# 2. Hansen J -- instrument validity
hansen = results_diff.hansen_j
print(f"\n2. Hansen J test (H0: instruments valid)")
print(f"   stat = {hansen.statistic:.3f}, p = {hansen.pvalue:.4f}, df = {hansen.df}")
if 0.10 < hansen.pvalue < 0.25:
    print("   PASS: Instruments appear valid (ideal range)")
elif hansen.pvalue < 0.10:
    print("   FAIL: Instruments rejected")
elif hansen.pvalue > 0.50:
    print("   WARNING: p-value very high -- possible weak instruments")
else:
    print("   PASS: Instruments acceptable")

# 3. AR(1) -- informational
ar1 = results_diff.ar1_test
print(f"\n3. AR(1) test (expected to reject)")
print(f"   z = {ar1.statistic:.3f}, p = {ar1.pvalue:.4f}")
if ar1.pvalue < 0.10:
    print("   EXPECTED: MA(1) structure from first-differencing")

# 4. Instrument count
print(f"\n4. Instrument diagnostics")
print(f"   Observations: {results_diff.nobs}")
print(f"   Groups: {results_diff.n_groups}")
print(f"   Instruments: {results_diff.n_instruments}")
print(f"   Ratio: {results_diff.instrument_ratio:.3f}")
if results_diff.instrument_ratio < 1.0:
    print("   PASS: Instrument count appropriate")
else:
    print("   WARNING: Too many instruments")

Step 6: Estimate System GMM¶

gmm_sys = SystemGMM(
    data=data,
    dep_var="n",
    lags=1,
    id_var="id",
    time_var="year",
    exog_vars=["w", "k"],
    time_dummies=True,
    collapse=True,
    two_step=True,
    robust=True,
    level_instruments={"max_lags": 1},
)

results_sys = gmm_sys.fit()
print(results_sys.summary())

gamma_sys = results_sys.params["L1.n"]
se_sys = results_sys.std_errors["L1.n"]

# Compare efficiency
efficiency_gain = (se_diff - se_sys) / se_diff * 100

print(f"\nComparison of Estimates:")
print(f"  OLS:            {gamma_ols:.4f} (upper bound)")
print(f"  System GMM:     {gamma_sys:.4f} (SE: {se_sys:.4f})")
print(f"  Difference GMM: {gamma_diff:.4f} (SE: {se_diff:.4f})")
print(f"  FE:             {gamma_fe:.4f} (lower bound)")
print(f"  Efficiency gain: {efficiency_gain:.1f}% SE reduction")

Step 7: Choose the Best Model¶

print("\n" + "=" * 70)
print("MODEL SELECTION")
print("=" * 70)

# Validity criteria
diff_valid = (
    results_diff.ar2_test.pvalue > 0.10
    and results_diff.hansen_j.pvalue > 0.10
    and results_diff.instrument_ratio < 1.0
)

sys_valid = (
    results_sys.ar2_test.pvalue > 0.10
    and results_sys.hansen_j.pvalue > 0.10
    and results_sys.instrument_ratio < 1.0
)

print(f"\nDifference GMM valid: {diff_valid}")
print(f"System GMM valid:     {sys_valid}")

if diff_valid and sys_valid:
    if se_sys < se_diff * 0.9:
        print("\nRECOMMENDATION: System GMM (more efficient and valid)")
        final_results = results_sys
    else:
        print("\nRECOMMENDATION: Difference GMM (more robust, similar efficiency)")
        final_results = results_diff
elif diff_valid:
    print("\nRECOMMENDATION: Difference GMM (System GMM fails diagnostics)")
    final_results = results_diff
elif sys_valid:
    print("\nRECOMMENDATION: System GMM (Difference GMM fails diagnostics)")
    final_results = results_sys
else:
    print("\nWARNING: Both models fail diagnostics -- revise specification")
    final_results = None

Step 8: Report Results¶

if final_results is not None:
    # Coefficient table
    coef_table = pd.DataFrame({
        "Coefficient": final_results.params,
        "Std. Error": final_results.std_errors,
        "z-stat": final_results.tvalues,
        "p-value": final_results.pvalues,
    })
    print("\nFinal Coefficient Table:")
    print(coef_table.to_string())

    # LaTeX export for papers
    latex = final_results.to_latex(
        caption="Dynamic Panel GMM Estimation",
        label="tab:gmm_results",
    )
    print("\nLaTeX output:")
    print(latex)

Part 4: Diagnostic Checklist¶

Use this checklist to validate every GMM estimation:

Essential Tests (Must Pass)¶

Test	Criterion	Why It Matters
AR(2) p > 0.10	No serial correlation	Validates moment conditions
Hansen J: 0.10 < p < 0.25	Instruments valid	Confirms instrument exogeneity
Instrument ratio < 1.0	No overfitting	Ensures test power
Coefficient in [FE, OLS]	Plausible estimate	Rules out gross misspecification

Warning Signs¶

Symptom	Likely Problem	Solution
AR(2) p < 0.05	Serial correlation in levels	Add lags: `lags=[1, 2]`
Hansen J p < 0.05	Invalid instruments	Reclassify variables, reduce instruments
Hansen J p > 0.50	Weak instruments or overfitting	Check ratio, try System GMM
Ratio > 1.0	Instrument proliferation	`collapse=True`, limit `gmm_max_lag`
Very large SEs	Weak instruments	Try System GMM
Coefficient outside bounds	Overfitting or misspecification	Simplify model
Few observations retained	Specification too complex	`time_dummies=False`

Part 5: Advanced Topics¶

Predetermined vs Endogenous Variables¶

Not all regressors are strictly exogenous. GMM handles three types:

Type	Assumption	Instrument Lags	Example
Strictly exogenous	\(E[x_{it}\varepsilon_{is}] = 0\) for all \(s, t\)	All (IV-style)	Policy shocks, weather
Predetermined	\(E[x_{it}\varepsilon_{is}] = 0\) for \(s \geq t\)	\(t-2\) and earlier	Lagged inputs
Endogenous	\(E[x_{it}\varepsilon_{it}] \neq 0\)	\(t-3\) and earlier	Contemporaneous inputs

model = DifferenceGMM(
    data=data,
    dep_var="y",
    lags=1,
    id_var="id",
    time_var="year",
    exog_vars=["policy"],           # Strictly exogenous
    predetermined_vars=["capital"],  # Predetermined
    endogenous_vars=["labor"],       # Endogenous
    collapse=True,
    two_step=True,
)

Handling Unbalanced Panels¶

Unbalanced panels require extra care:

Always use collapse=True -- Reduces instruments and handles sparsity
Consider time_dummies=False -- Many dummies with unbalanced panels can cause identification issues
Keep specifications parsimonious -- Fewer parameters means better identification
Check observation retention -- Large drops signal problems

model = DifferenceGMM(
    data=unbalanced_data,
    dep_var="y",
    lags=1,
    id_var="id",
    time_var="year",
    exog_vars=["x1", "x2"],
    time_dummies=False,  # Avoid issues with unbalanced panels
    collapse=True,       # Essential for unbalanced panels
    two_step=True,
)
results = model.fit()
retention = results.nobs / len(unbalanced_data) * 100
print(f"Observation retention: {retention:.1f}%")

Limiting Instrument Depth¶

Control which lags are used as instruments:

# Use only lags 2 and 3 (instead of all available)
model = DifferenceGMM(
    data=data,
    dep_var="y",
    lags=1,
    exog_vars=["x1"],
    collapse=True,
    gmm_max_lag=3,   # Only y_{t-2} and y_{t-3} as instruments
    two_step=True,
)

Weak Instruments¶

Symptoms: Very large standard errors, Hansen J p-value > 0.50, implausible coefficients.

Solutions:

Try System GMM -- additional instruments for persistent series
Reduce instrument count -- focus on most relevant lags
Increase sample size -- more groups or time periods
Check if GMM is truly needed -- compare OLS/FE gap

Small Sample Corrections¶

For small N or T:

Windmeijer correction -- automatically applied with two_step=True, robust=True
One-step GMM -- less efficient but more robust: gmm_type="one_step"
Conservative instruments -- always use collapse=True
Bias correction -- consider BiasCorrectedGMM for moderate N and T

Overfitting Diagnostics¶

from panelbox.gmm import GMMOverfitDiagnostic

diag = GMMOverfitDiagnostic(model, results)

# Full diagnostic report with traffic-light signals
print(diag.summary())

# Individual checks
feas = diag.assess_feasibility()
bounds = diag.coefficient_bounds_test()
step = diag.step_comparison()

See Instruments for a detailed guide to the GMMOverfitDiagnostic class.

Summary: Your GMM Workflow¶

Check if you need GMM: Lagged dependent variable + fixed effects + small T
Establish OLS/FE bounds: Credible range for the AR coefficient
Start with Difference GMM: collapse=True, two_step=True, robust=True
Run full diagnostics: AR(2), Hansen J, instrument ratio, bounds check
Try System GMM if series is persistent or Difference GMM has large SEs
Choose the best model: Valid diagnostics + smaller standard errors
Report results: Include all diagnostic tests in your paper

Tutorials¶

Tutorial	Description	Link
Difference GMM	Arellano-Bond estimator details	Difference GMM
System GMM	Blundell-Bond estimator details	System GMM
CUE-GMM	Continuous Updating Estimator	CUE-GMM
Bias-Corrected GMM	Analytical bias correction	Bias-Corrected
Instruments	Instrument selection and management	Instruments
Diagnostics	Complete diagnostic test guide	Diagnostics

References¶

Arellano, M., & Bond, S. (1991). "Some Tests of Specification for Panel Data: Monte Carlo Evidence and an Application to Employment Equations." Review of Economic Studies, 58(2), 277-297.
Blundell, R., & Bond, S. (1998). "Initial Conditions and Moment Restrictions in Dynamic Panel Data Models." Journal of Econometrics, 87(1), 115-143.
Windmeijer, F. (2005). "A Finite Sample Correction for the Variance of Linear Efficient Two-Step GMM Estimators." Journal of Econometrics, 126(1), 25-51.
Roodman, D. (2009). "How to do xtabond2: An Introduction to Difference and System GMM in Stata." The Stata Journal, 9(1), 86-136.
Nickell, S. (1981). "Biases in Dynamic Models with Fixed Effects." Econometrica, 49(6), 1417-1426.
Bond, S. R. (2002). "Dynamic Panel Data Models: A Guide to Micro Data Methods and Practice." Portuguese Economic Journal, 1(2), 141-162.
Baltagi, B. H. (2021). Econometric Analysis of Panel Data (6^th ed.). Springer.
Wooldridge, J. M. (2010). Econometric Analysis of Cross Section and Panel Data (2^nd ed.). MIT Press.