System GMM (Blundell-Bond)
Quick Reference
Class: panelbox.gmm.SystemGMM
Import: from panelbox.gmm import SystemGMM
Stata equivalent: xtabond2 y L.y x1 x2, gmm(y, lag(2 .)) iv(x1 x2)
R equivalent: pgmm(y ~ lag(y, 1) + x1 + x2 | lag(y, 2:99), transformation = "ld")
Overview
System GMM, proposed by Blundell and Bond (1998), extends Difference GMM by combining two sets of equations in a stacked system: the first-differenced equations (with lagged levels as instruments) and the level equations (with lagged differences as instruments). This additional set of moment conditions addresses the weak instruments problem that affects Difference GMM when the dependent variable is highly persistent.
When the autoregressive coefficient \(\gamma\) approaches 1, lagged levels become poor predictors of first-differences, leading to large standard errors and imprecise estimates in Difference GMM. System GMM exploits the additional moment condition \(E[\Delta y_{i,t-1} \cdot (\alpha_i + \varepsilon_{it})] = 0\) to provide stronger instruments for the level equation, typically reducing standard errors by 20-50%.
The trade-off is an additional assumption: the stationarity of initial conditions, requiring that the initial deviations from steady state are uncorrelated with the fixed effects. This assumption is plausible when the panel data comes from an ongoing process observed well after its start.
Quick Example
from panelbox.gmm import SystemGMM
from panelbox.datasets import load_abdata
data = load_abdata()
model = SystemGMM(
    data=data,
    dep_var="n",
    lags=1,
    id_var="id",
    time_var="year",
    exog_vars=["w", "k"],
    collapse=True,
    two_step=True,
    robust=True,
    level_instruments={"max_lags": 1},
)
results = model.fit()
print(results.summary())
When to Use
- Highly persistent series: AR coefficient \(\gamma > 0.8\) where Difference GMM instruments are weak
- Small T, large N: Short panels where efficiency gains matter
- Stationarity is plausible: The panel does not start at a special event (e.g., firm entry, policy change)
- Difference GMM has large SEs: Standard errors from Difference GMM are much larger than expected
Key Assumptions
All Difference GMM assumptions, plus:
- Stationarity of initial conditions: \(E[\Delta y_{i,1} \cdot \alpha_i] = 0\)
- This requires the process generating \(y_{it}\) started long before the first observation
- Violated when the panel begins at firm entry, policy implementation, or other event times
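A small simulation, independent of panelbox, can illustrate why the starting point matters. The sketch below assumes a pure AR(1) with unit-variance fixed effects and shocks; the function name and all parameter values are illustrative, not part of the library. It compares the sample analogue of \(E[\Delta y \cdot \alpha_i]\) for the first observed difference when the series starts from its stationary distribution and when every unit starts at zero, as it would at firm entry.

```python
import numpy as np

def delta_alpha_moment(stationary_start, gamma=0.8, n_units=50_000, seed=0):
    """Sample analogue of E[dy * alpha] for the first observed difference."""
    rng = np.random.default_rng(seed)
    alpha = rng.normal(size=n_units)  # fixed effects
    if stationary_start:
        # Draw the pre-sample value from the stationary distribution
        # centred on each unit's long-run mean alpha / (1 - gamma).
        dev = rng.normal(scale=1.0 / np.sqrt(1.0 - gamma**2), size=n_units)
        y_prev = alpha / (1.0 - gamma) + dev
    else:
        # The process begins at the first observation (assumed start at zero).
        y_prev = np.zeros(n_units)
    y1 = gamma * y_prev + alpha + rng.normal(size=n_units)
    y2 = gamma * y1 + alpha + rng.normal(size=n_units)
    return float(np.mean((y2 - y1) * alpha))

m_stationary = delta_alpha_moment(stationary_start=True)
m_entry = delta_alpha_moment(stationary_start=False)
print(f"stationary start: {m_stationary:+.3f}")
print(f"start at zero:    {m_entry:+.3f}")
```

Under the stationary start the moment should be close to zero, while under the zero start the first differences still load on \(\alpha_i\), which is the situation in which the level-equation instruments become invalid.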
Detailed Guide
The Weak Instruments Problem
When \(y_{it}\) is highly persistent (\(\gamma\) close to 1):
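- In the pure AR(1) case, \(\Delta y_{it} = (\gamma - 1)\, y_{i,t-1} + \alpha_i + \varepsilon_{it}\), so the part of the difference explained by past levels shrinks as \(\gamma \to 1\)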
- Lagged levels \(y_{i,t-2}\) are poor predictors of \(\Delta y_{i,t-1}\)
- Instruments are weak, leading to large standard errors and biased estimates
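The loss of instrument strength can be seen in a short simulation. This is a sketch under stated assumptions (pure AR(1), unit-variance fixed effects and shocks, each unit started at its long-run mean and burned in); `instrument_correlation` is an illustrative name, not a panelbox function. It reports the absolute correlation between the instrument \(y_{i,t-2}\) and the regressor it instruments, \(\Delta y_{i,t-1}\), for increasing persistence.

```python
import numpy as np

def instrument_correlation(gamma, n_units=20_000, burn_in=100, seed=0):
    """Correlation between the instrument y_{i,t-2} and the regressor dy_{i,t-1}."""
    rng = np.random.default_rng(seed)
    alpha = rng.normal(size=n_units)   # fixed effects
    y = alpha / (1.0 - gamma)          # start at each unit's long-run mean
    levels = []
    for _ in range(burn_in + 2):
        y = gamma * y + alpha + rng.normal(size=n_units)
        levels.append(y)
    y_lag2, y_lag1 = levels[-2], levels[-1]
    return float(np.corrcoef(y_lag2, y_lag1 - y_lag2)[0, 1])

strength = {g: abs(instrument_correlation(g)) for g in (0.3, 0.6, 0.9)}
for g, corr in strength.items():
    print(f"gamma = {g:.1f}: |corr(y_(t-2), dy_(t-1))| = {corr:.3f}")
```

The correlation should fall steadily as \(\gamma\) rises, which is the first-stage weakness that inflates Difference GMM standard errors.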
The System GMM Solution
System GMM stacks two sets of equations:
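1. Difference equations (same as Arellano-Bond):
\[\Delta y_{it} = \gamma \, \Delta y_{i,t-1} + \Delta x_{it}'\beta + \Delta \varepsilon_{it}\]
Instruments: lagged levels \(y_{i,t-2}, y_{i,t-3}, \ldots\)
2. Level equations (additional):
\[y_{it} = \gamma \, y_{i,t-1} + x_{it}'\beta + \alpha_i + \varepsilon_{it}\]
Instruments: lagged differences \(\Delta y_{i,t-1}\)
Here \(x_{it}\) denotes the vector of additional regressors. The additional moment condition for the level equation is:
\[E[\Delta y_{i,t-1} \cdot (\alpha_i + \varepsilon_{it})] = 0\]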
The Bounds Rule
A useful validation check: the true coefficient should satisfy:
\[\hat{\gamma}_{FE} < \gamma < \hat{\gamma}_{OLS}\]
- OLS overestimates \(\gamma\) (omitted variable bias from \(\alpha_i\))
- FE underestimates \(\gamma\) (Nickell bias)
- A valid GMM estimate should fall between these bounds
# Validate with bounds check
from panelbox.gmm import GMMOverfitDiagnostic
diag = GMMOverfitDiagnostic(model, results)
bounds = diag.coefficient_bounds_test()
print(f"OLS (upper): {bounds['ols_coef']:.4f}")
print(f"GMM: {bounds['gmm_coef']:.4f}")
print(f"FE (lower): {bounds['fe_coef']:.4f}")
print(f"Within bounds: {bounds['within_bounds']}")
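The direction of the two biases behind the bounds rule can be reproduced without panelbox. The following sketch assumes a simulated pure AR(1) panel with a true coefficient of 0.5, six periods, and unit-variance fixed effects and shocks (all illustrative choices), and computes the pooled OLS and within (FE) slopes by hand.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_periods, gamma_true = 2_000, 6, 0.5

alpha = rng.normal(size=n_units)  # fixed effects
y = np.empty((n_units, n_periods + 1))
# Start each unit from the stationary distribution of the process.
y[:, 0] = alpha / (1 - gamma_true) + rng.normal(
    scale=1 / np.sqrt(1 - gamma_true**2), size=n_units
)
for t in range(1, n_periods + 1):
    y[:, t] = gamma_true * y[:, t - 1] + alpha + rng.normal(size=n_units)

y_lag, y_cur = y[:, :-1], y[:, 1:]

def slope(x, z):
    """OLS slope of z on x after removing the overall mean of each."""
    x_c, z_c = x - x.mean(), z - z.mean()
    return float((x_c * z_c).sum() / (x_c**2).sum())

# Pooled OLS ignores alpha_i, which is positively correlated with y_{i,t-1}.
ols_coef = slope(y_lag.ravel(), y_cur.ravel())
# Within estimator: demean each unit's series before pooling.
fe_coef = slope(
    (y_lag - y_lag.mean(axis=1, keepdims=True)).ravel(),
    (y_cur - y_cur.mean(axis=1, keepdims=True)).ravel(),
)
print(f"FE: {fe_coef:.3f}  true: {gamma_true:.3f}  OLS: {ols_coef:.3f}")
```

With a short panel the within estimate should sit well below the true value and the pooled OLS estimate well above it, so a GMM estimate outside that interval is a warning sign.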
Data Preparation
System GMM accepts the same data format as Difference GMM. No special preparation is needed beyond ensuring correct panel structure.
import pandas as pd
from panelbox.datasets import load_abdata
data = load_abdata()
print(f"N = {data['id'].nunique()}, T = {data['year'].nunique()}")
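One way to check the panel structure before estimation is sketched below. `check_panel` is a hypothetical helper written for this page, not a panelbox function, and the toy DataFrame is made up; it only uses standard pandas calls to count units and periods, look for duplicated id-time pairs, and test whether the panel is balanced.

```python
import pandas as pd

def check_panel(df, id_var, time_var):
    """Return basic structural facts about a long-format panel."""
    n_periods = int(df[time_var].nunique())
    periods_per_unit = df.groupby(id_var)[time_var].nunique()
    return {
        "n_units": int(df[id_var].nunique()),
        "n_periods": n_periods,
        # Each (id, time) pair should appear exactly once.
        "duplicate_rows": int(df.duplicated(subset=[id_var, time_var]).sum()),
        # Balanced means every unit is observed in every period.
        "balanced": bool((periods_per_unit == n_periods).all()),
        "min_periods_per_unit": int(periods_per_unit.min()),
    }

toy = pd.DataFrame({
    "id":   [1, 1, 1, 2, 2, 3, 3, 3],
    "year": [2000, 2001, 2002, 2000, 2001, 2000, 2001, 2002],
    "n":    [1.0, 1.1, 1.2, 2.0, 2.1, 3.0, 3.1, 3.2],
})
report = check_panel(toy, "id", "year")
print(report)
```

An unbalanced panel is not a problem in itself, but units with very few periods contribute few or no usable moment conditions once lags and differences are taken.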
Estimation
from panelbox.gmm import SystemGMM
model = SystemGMM(
    data=data,
    dep_var="n",
    lags=1,
    id_var="id",
    time_var="year",
    exog_vars=["w", "k"],
    collapse=True,
    two_step=True,
    robust=True,
    level_instruments={"max_lags": 1},  # Use only first lag of differences
)
results = model.fit()
Comparing Difference vs System GMM
from panelbox.gmm import DifferenceGMM, SystemGMM
# Estimate both
diff_model = DifferenceGMM(
    data=data, dep_var="n", lags=1, id_var="id", time_var="year",
    exog_vars=["w", "k"], collapse=True, two_step=True, robust=True,
)
diff_results = diff_model.fit()
sys_model = SystemGMM(
    data=data, dep_var="n", lags=1, id_var="id", time_var="year",
    exog_vars=["w", "k"], collapse=True, two_step=True, robust=True,
    level_instruments={"max_lags": 1},
)
sys_results = sys_model.fit()
# Compare
coef = "L1.n"
diff_se = diff_results.std_errors[coef]
sys_se = sys_results.std_errors[coef]
efficiency_gain = (diff_se - sys_se) / diff_se * 100
print(f"Difference GMM: {diff_results.params[coef]:.4f} (SE: {diff_se:.4f})")
print(f"System GMM: {sys_results.params[coef]:.4f} (SE: {sys_se:.4f})")
print(f"Efficiency gain: {efficiency_gain:.1f}% SE reduction")
Interpreting Results
System GMM results include the same diagnostics as Difference GMM, plus the Difference-in-Hansen test for the validity of level instruments.
# Standard diagnostics
print(f"AR(2) p-value: {results.ar2_test.pvalue:.4f}")
print(f"Hansen J p-value: {results.hansen_j.pvalue:.4f}")
print(f"Instruments: {results.n_instruments}, Ratio: {results.instrument_ratio:.3f}")
# System GMM-specific: Difference-in-Hansen test
if results.diff_hansen is not None:
    print(f"Diff-in-Hansen p-value: {results.diff_hansen.pvalue:.4f}")
    if results.diff_hansen.pvalue > 0.10:
        print("Level instruments appear valid")
    else:
        print("Level instruments rejected -- use Difference GMM instead")
Configuration Options
System GMM inherits all parameters from Difference GMM, plus:
| Parameter | Type | Default | Description |
|---|---|---|---|
| level_instruments | dict | {"max_lags": 1} | Configuration for level equation instruments |
The level_instruments dictionary controls the depth of lagged differences used as instruments for the level equation:
- {"max_lags": 1} -- Use only \(\Delta y_{i,t-1}\) (most conservative, recommended)
- {"max_lags": 2} -- Use \(\Delta y_{i,t-1}\) and \(\Delta y_{i,t-2}\)
- Deeper lags rarely improve efficiency
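To make the lag depth concrete, the sketch below builds the lagged first differences by hand with pandas. It is an illustration of what \(\Delta y_{i,t-1}\) and \(\Delta y_{i,t-2}\) are, not a description of how panelbox constructs its instrument matrix; `lagged_differences` and the `D.y.L1`-style column names are hypothetical, and the code assumes consecutive time periods with no gaps within a unit.

```python
import pandas as pd

def lagged_differences(df, id_var, time_var, var, max_lags=1):
    """Add columns holding the first difference of `var`, lagged 1..max_lags periods."""
    out = df.sort_values([id_var, time_var]).copy()
    diff = out.groupby(id_var)[var].diff()  # dy_{i,t}, computed within each unit
    for lag in range(1, max_lags + 1):
        # Shift within each unit so lags never cross unit boundaries.
        out[f"D.{var}.L{lag}"] = diff.groupby(out[id_var]).shift(lag)
    return out

toy = pd.DataFrame({
    "id":   [1, 1, 1, 1, 2, 2, 2, 2],
    "year": [2000, 2001, 2002, 2003] * 2,
    "y":    [1.0, 3.0, 6.0, 10.0, 5.0, 5.0, 7.0, 8.0],
})
inst = lagged_differences(toy, "id", "year", "y", max_lags=2)
print(inst)
```

Each extra lag costs one more usable period per unit, which is one reason a shallow `max_lags` is a reasonable default in short panels.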
Diagnostics
Decision Flowchart
- If both Difference and System GMM pass diagnostics, prefer System GMM when it has substantially smaller standard errors (\(> 10\%\) reduction).
- If System GMM fails the Difference-in-Hansen test (p < 0.10), the level instruments are rejected and the stationarity assumption is likely violated. Use Difference GMM.
- If Difference GMM has very large SEs but System GMM diagnostics pass, the series may be too persistent for Difference GMM. Use System GMM.
- Otherwise, revisit the model specification. Consider adding lags, changing exogeneity assumptions, or reducing instruments.
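The decision rules above can be written as a small function. This is a hypothetical helper, not part of panelbox: `diff_ok` and `sys_ok` stand for "passes its own diagnostics", the third rule is simplified to "only System GMM passes", and the fallbacks to Difference GMM when the SE gain is small or when only Difference GMM passes are assumptions made here on the grounds that it needs fewer assumptions.

```python
def choose_estimator(diff_ok, sys_ok, diff_hansen_p, diff_se, sys_se,
                     alpha=0.10, min_se_reduction=0.10):
    """Hypothetical helper encoding the decision rules for Difference vs System GMM."""
    if diff_hansen_p < alpha:
        # Level instruments rejected by the Difference-in-Hansen test.
        return "difference"
    if diff_ok and sys_ok:
        reduction = (diff_se - sys_se) / diff_se
        # Assumption: without a clear SE gain, keep the estimator with fewer assumptions.
        return "system" if reduction > min_se_reduction else "difference"
    if sys_ok:
        # Simplification of "Difference GMM has very large SEs but System GMM passes".
        return "system"
    if diff_ok:
        # Assumption: fall back to Difference GMM when only it passes.
        return "difference"
    return "respecify"

# Illustrative inputs only; these are not estimates from any dataset.
print(choose_estimator(True, True, 0.40, diff_se=0.10, sys_se=0.06))
print(choose_estimator(True, True, 0.03, diff_se=0.10, sys_se=0.06))
print(choose_estimator(False, False, 0.40, diff_se=0.50, sys_se=0.30))
```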
System GMM Diagnostic Checklist
| Test | Criterion | Action if Failed |
|---|---|---|
| AR(2) | p > 0.10 | Add more lags, check specification |
| Hansen J | 0.10 < p < 0.25 | If p is too high (well above 0.25): reduce instruments |
| Diff-in-Hansen | p > 0.10 | If rejected: use Difference GMM |
| Instrument ratio (instruments / groups) | < 1.0 | Use collapse=True |
| Coefficient bounds | FE < GMM < OLS | Check for overfitting |
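The checklist can also be evaluated programmatically. The helper below is a sketch, not a panelbox API: `run_checklist` is a hypothetical name, it takes plain numbers rather than a results object, and it assumes the instrument ratio is the number of instruments divided by the number of groups. The example values are made up for illustration.

```python
def run_checklist(ar2_p, hansen_p, diff_hansen_p, n_instruments, n_groups,
                  gmm_coef, fe_coef, ols_coef):
    """Hypothetical helper: return a pass/fail flag for each checklist row."""
    return {
        "ar2": ar2_p > 0.10,
        "hansen_j": 0.10 < hansen_p < 0.25,
        "diff_hansen": diff_hansen_p > 0.10,
        # Assumed definition: instruments per cross-sectional group.
        "instrument_ratio": n_instruments / n_groups < 1.0,
        "coefficient_bounds": fe_coef < gmm_coef < ols_coef,
    }

# Illustrative numbers only.
checks = run_checklist(
    ar2_p=0.35, hansen_p=0.62, diff_hansen_p=0.40,
    n_instruments=12, n_groups=140,
    gmm_coef=0.85, fe_coef=0.55, ols_coef=0.95,
)
failed = [name for name, passed in checks.items() if not passed]
print("Failed checks:", failed or "none")
```

In this made-up case only the Hansen J row fails, because its p-value lies above the upper end of the range in the table.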
For comprehensive diagnostic guidance, see GMM Diagnostics.
Tutorials
| Tutorial | Description | Link |
|---|---|---|
| Complete GMM Guide | Step-by-step applied tutorial | Complete Guide |
| Instrument Selection | Managing instruments in System GMM | Instruments |
See Also
- Difference GMM -- Arellano-Bond Difference GMM (fewer assumptions)
- CUE-GMM -- Continuous Updating Estimator for robustness
- Bias-Corrected GMM -- Analytical bias correction
- Instruments -- Instrument selection and proliferation
- Diagnostics -- Complete diagnostic test guide
- Complete Guide -- Step-by-step applied tutorial
References
- Blundell, R., & Bond, S. (1998). "Initial Conditions and Moment Restrictions in Dynamic Panel Data Models." Journal of Econometrics, 87(1), 115-143.
- Arellano, M., & Bond, S. (1991). "Some Tests of Specification for Panel Data: Monte Carlo Evidence and an Application to Employment Equations." Review of Economic Studies, 58(2), 277-297.
- Roodman, D. (2009). "How to do xtabond2: An Introduction to Difference and System GMM in Stata." The Stata Journal, 9(1), 86-136.
- Bond, S. R., Hoeffler, A., & Temple, J. (2001). "GMM Estimation of Empirical Growth Models." Economics Papers 2001-W21, Nuffield College, University of Oxford.
- Windmeijer, F. (2005). "A Finite Sample Correction for the Variance of Linear Efficient Two-Step GMM Estimators." Journal of Econometrics, 126(1), 25-51.