Advanced FAQ
Technical questions for experienced users covering GMM, Panel VAR, Heckman selection models, cointegration, count models, discrete choice, and performance optimization.
New to PanelBox?
Start with the General FAQ for getting started, model selection, and results interpretation.
GMM Advanced
How does CUE-GMM differ from two-step GMM?
The Continuously Updated Estimator (CUE-GMM) re-evaluates the weighting matrix at every candidate coefficient vector instead of fixing it at a first-step estimate, so the result does not depend on the choice of initial weighting matrix.
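In symbols, the difference can be sketched as follows. The notation is an assumption for illustration and is not taken from the PanelBox documentation: \(\bar g(\theta)\) is the vector of sample moments, \(\hat S(\theta)\) is the estimated covariance of the moments, and \(\hat\theta_1\) is a first-step estimate.

```latex
\hat\theta_{\text{2S}}
  = \arg\min_{\theta}\; \bar g(\theta)^{\top}\, \hat S(\hat\theta_{1})^{-1}\, \bar g(\theta)
\qquad
\hat\theta_{\text{CUE}}
  = \arg\min_{\theta}\; \bar g(\theta)^{\top}\, \hat S(\theta)^{-1}\, \bar g(\theta)
```

In two-step GMM the weighting matrix is held fixed at \(\hat\theta_1\) while \(\theta\) is optimized. In CUE it moves with \(\theta\), which is why the objective is harder to minimize and convergence can be slower.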
| Feature | Two-Step GMM | CUE-GMM |
|---|---|---|
| Efficiency | Good | Higher (asymptotically) |
| Computation | Fast | Slower (iterative) |
| Weak instruments | Sensitive | More robust |
| Sample requirement | N > 50 | N > 200 |
from panelbox.gmm import DifferenceGMM, ContinuousUpdatedGMM
# Two-step GMM (fast, for exploration)
gmm2 = DifferenceGMM(data, dep_var="y", lags=[1], exog_vars=["x"])
result2 = gmm2.fit(steps=2)
# CUE-GMM (more efficient but slower)
cue = ContinuousUpdatedGMM(data, dep_var="y", lags=[1], exog_vars=["x"])
result_cue = cue.fit()
Use CUE when you need maximum efficiency, weak instruments are a concern, and computation time is acceptable. Use two-step when you need quick results or N is small.
When should I use bias-corrected GMM?
Use BiasCorrectedGMM when standard GMM suffers from O(1/N) bias — typically with:
- Small N (N < 100)
- Persistent dependent variable (AR coefficient > 0.7)
- Short T (T < 10)
from panelbox.gmm import BiasCorrectedGMM
bc_gmm = BiasCorrectedGMM(data, dep_var="y", lags=[1])
result = bc_gmm.fit(order=2) # Second-order correction
print(f"Bias magnitude: {result.bias_magnitude()}")
# If > 10% of coefficient, correction is important
Minimum sample sizes: first-order correction needs N >= 50, second-order needs N >= 100.
My GMM has convergence issues. What should I do?
Troubleshooting steps:
- Use two-step estimates as starting values.
- Reduce the instrument count. Rule of thumb: instruments <= 2 * parameters.
- Check for multicollinearity and data quality (no NaN, outliers).
- Fall back to two-step GMM if CUE doesn't converge.
How do I handle instrument proliferation?
Instrument proliferation occurs when the number of instruments exceeds the number of entities, causing overfitting and unreliable Hansen J tests.
Rule of thumb: instruments < N (number of entities)
Solutions:
- Use collapse=True to reduce instruments
- Limit lag depth with max_lags=2
- Check the GMMOverfitDiagnostic after estimation
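To see how quickly the instrument count grows, and how much collapsing or a lag limit helps, the count can be worked out by hand. The sketch below assumes the usual Arellano-Bond layout for a single lagged dependent variable (lags t-2 and deeper instrument the differenced equation at period t). The function `n_gmm_instruments` is a hypothetical helper, not part of PanelBox, and PanelBox's own interpretation of `max_lags` may differ.

```python
def n_gmm_instruments(T, max_lags=None, collapse=False):
    """Count GMM-style instruments for one lagged dependent variable
    in difference GMM, for differenced equations at t = 3..T."""
    # Number of usable lags of y in each differenced equation.
    available = [t - 2 for t in range(3, T + 1)]
    if max_lags is not None:
        available = [min(a, max_lags) for a in available]
    if not available:
        return 0
    # Uncollapsed: one instrument column per (period, lag) pair.
    # Collapsed: one column per lag distance, shared across periods.
    return max(available) if collapse else sum(available)


T = 10
print(n_gmm_instruments(T))                             # 36
print(n_gmm_instruments(T, collapse=True))              # 8
print(n_gmm_instruments(T, max_lags=2))                 # 15
print(n_gmm_instruments(T, max_lags=2, collapse=True))  # 2
```

With T = 10 the uncollapsed count is already 36 for one variable, so a panel with a few dozen entities can breach the instruments < N rule before any other regressors are instrumented.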
Hansen J test says everything is fine but results look wrong?
This is a classic symptom of overfitting. When you have too many instruments, the Hansen J test loses power and fails to reject even when instruments are invalid.
Red flags:
- Hansen J p-value very close to 1.0
- Number of instruments >> N
- Results change dramatically when reducing instruments
Solution: re-estimate with collapse=True and fewer lags. A well-behaved Hansen J p-value is typically between 0.10 and 0.80.
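The rules of thumb in this section can be bundled into a quick sanity check. This is a minimal sketch: `hansen_red_flags` is a hypothetical helper (not a PanelBox function), and the thresholds are the ones quoted above rather than formal critical values.

```python
def hansen_red_flags(hansen_p, n_instruments, n_entities):
    """Return a list of warnings based on this FAQ's rules of thumb."""
    flags = []
    if n_instruments >= n_entities:
        flags.append("instrument count is not below the number of entities")
    if hansen_p > 0.80:
        flags.append("Hansen J p-value is suspiciously high (possible overfitting)")
    elif hansen_p < 0.10:
        flags.append("Hansen J p-value is low (instruments may be invalid)")
    return flags


# Illustrative numbers only.
for flag in hansen_red_flags(hansen_p=0.998, n_instruments=120, n_entities=80):
    print("WARNING:", flag)
```

An empty list does not prove the instruments are valid; it only means none of the red flags listed above was triggered.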
Panel VAR
When should I use VAR vs VECM?
| Condition | Model |
|---|---|
| Variables are stationary (I(0)) | Panel VAR |
| Variables are non-stationary (I(1)) and cointegrated | Panel VECM |
| Variables are I(1) and not cointegrated | VAR in first differences |
Recommended workflow:
from panelbox.validation import IPSTest
# 1. Test for unit roots
for var in ["y1", "y2"]:
    test = IPSTest(data, variable=var, entity_col="entity", time_col="time")
    result = test.run()
    print(f"{var}: {'stationary' if result.pvalue < 0.05 else 'unit root'}")
# 2. If all I(1), test cointegration
from panelbox.diagnostics.cointegration import pedroni_test
coint = pedroni_test(data, dependent="y1", covariates=["y2"])
# 3. If cointegrated → VECM
from panelbox.var import PanelVECM
vecm = PanelVECM(data, ...)
How do I select the optimal lag order?
Use information criteria:
from panelbox.var import PanelVAR
pvar = PanelVAR(data, variables=["y1", "y2"], entity_col="entity", time_col="time")
lag_result = pvar.select_lag_order(max_lags=5, criterion="bic")
print(f"Optimal lags: {lag_result.optimal_lag}")
Criteria:
- BIC: more conservative, penalizes extra parameters more — good default
- AIC: less conservative, may overfit
- HQIC: intermediate between AIC and BIC
Practical guidelines: annual data → 1-2 lags; quarterly → 1-4; monthly → 1-12.
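The different penalties explain why the criteria can disagree. The sketch below uses the textbook likelihood-based definitions, which is an assumption: PanelBox may compute its criteria differently (for example from the residual covariance, or with moment-selection criteria under GMM). The log-likelihoods and parameter counts are made-up numbers for illustration.

```python
import math


def information_criteria(loglik, n_params, n_obs):
    """Textbook AIC, BIC and HQIC; lower values are better."""
    aic = -2 * loglik + 2 * n_params
    bic = -2 * loglik + n_params * math.log(n_obs)
    hqic = -2 * loglik + 2 * n_params * math.log(math.log(n_obs))
    return {"aic": aic, "bic": bic, "hqic": hqic}


# Hypothetical fits: lag order -> (log-likelihood, number of parameters).
candidates = {1: (-1520.0, 6), 2: (-1510.0, 10), 3: (-1507.0, 14)}
n_obs = 500

best = {}
for name in ("aic", "bic", "hqic"):
    best[name] = min(
        candidates,
        key=lambda p: information_criteria(*candidates[p], n_obs)[name],
    )
print(best)
```

With these numbers AIC and HQIC select two lags while BIC selects one, because BIC charges about ln(500) ≈ 6.2 per parameter against 2 for AIC.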
How do I interpret Granger causality in Panel VAR?
Granger causality is predictive, not structural. "\(X\) Granger-causes \(Y\)" means past values of \(X\) improve the prediction of \(Y\), controlling for past \(Y\).
result = pvar.fit(lags=2)
gc = result.granger_causality("x1", "x2")
print(f"p-value: {gc.pvalue:.4f}")
# p < 0.05: x1 Granger-causes x2
Important: Granger causality does NOT imply true causation. It is a statement about prediction, not about causal mechanisms.
For panel data specifically, the Dumitrescu-Hurlin test has more power and allows for heterogeneity across entities.
IRF confidence bands — bootstrap vs asymptotic?
| Method | Pros | Cons |
|---|---|---|
| Bootstrap | More reliable in small samples; distribution-free | Slow (especially with many replications) |
| Asymptotic | Fast | May be inaccurate in small samples |
# Bootstrap (recommended for final results)
irf = result.irf(periods=10, ci_method="bootstrap", n_boot=1000)
# Asymptotic (fast, for exploration)
irf = result.irf(periods=10, ci_method="analytical")
If bootstrap IRFs look asymmetric or strange, increase n_boot to 2000 or try bootstrap_type="residual".
My Panel VAR is unstable (eigenvalues > 1). What should I do?
An unstable VAR (at least one eigenvalue of the companion matrix with modulus of 1 or more) produces IRFs that do not die out over time. Common causes:
- Variables are non-stationary — test with unit root tests, difference if needed
- Too many lags — try reducing the lag order
- Cointegrated variables — use VECM instead of VAR
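The stability condition itself is easy to check by hand in the simplest case. The sketch below covers only a two-variable VAR(1), where the eigenvalues of the coefficient matrix have a closed form; the function names are illustrative and the matrices are made-up examples. For more variables or lags, the same check applies to the companion matrix, for instance with `numpy.linalg.eigvals`.

```python
import cmath


def var1_eigen_moduli(a):
    """Moduli of the eigenvalues of a 2x2 VAR(1) coefficient matrix."""
    (a11, a12), (a21, a22) = a
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    # Roots of the characteristic polynomial x^2 - trace*x + det = 0.
    root = cmath.sqrt(trace * trace - 4 * det)
    return sorted(abs((trace + sign * root) / 2) for sign in (1, -1))


def is_stable(a):
    """Stable if every eigenvalue lies strictly inside the unit circle."""
    return max(var1_eigen_moduli(a)) < 1.0


stable_coefs = [[0.5, 0.1], [0.2, 0.3]]
unstable_coefs = [[1.1, 0.0], [0.3, 0.5]]

print(var1_eigen_moduli(stable_coefs), is_stable(stable_coefs))
print(var1_eigen_moduli(unstable_coefs), is_stable(unstable_coefs))
```

A largest modulus just below 1 is technically stable but implies very persistent responses, which is often a hint that the variables are close to non-stationary.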
OLS vs GMM for Panel VAR — which should I use?
| Method | Use when | Pros | Cons |
|---|---|---|---|
| OLS | T >> N, exogeneity holds | Fast, simple | Nickell bias when T small |
| GMM | T small (~10-20), N large | Consistent, handles endogeneity | Needs N large, sensitive to instruments |
Start with OLS for exploration, then switch to GMM if diagnostics indicate problems (endogeneity, small T).
Panel Heckman / Selection Models
When should I use the Heckman selection model?
Use PanelHeckman when your sample is non-randomly selected:
- Wages observed only for workers (not unemployed)
- Firm performance observed only for surviving firms
- Exports observed only for exporting firms
Requirements:
- You can model the selection decision
- An exclusion restriction exists (variable affects selection but not outcome)
What are exclusion restrictions and why do they matter?
An exclusion restriction is a variable that:
- Affects whether you observe the outcome (selection equation)
- Does not directly affect the outcome itself
Example: "number of young children" affects labor force participation (selection) but should not directly affect wages (outcome), after controlling for experience and education.
Without a valid exclusion restriction, the Heckman model is identified only through functional form assumptions, making estimates fragile.
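The fragility comes from the shape of the correction term. In the standard two-step estimator, the term added to the outcome equation is the inverse Mills ratio of the selection index, \(\lambda(z) = \phi(z) / \Phi(z)\). Over much of its range this function is close to linear in \(z\), so if the selection equation contains only the outcome regressors, the correction term is nearly collinear with them. A minimal sketch of the function, using illustrative helper names that are not PanelBox functions:

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def inverse_mills(z):
    """lambda(z) = phi(z) / Phi(z), the two-step correction term."""
    return norm_pdf(z) / norm_cdf(z)


for z in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(f"z = {z:+.1f}  lambda = {inverse_mills(z):.3f}")
```

An excluded variable shifts \(z\) independently of the outcome regressors, which is what gives the correction term variation of its own.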
Two-step vs MLE for Heckman — which should I use?
| Feature | Two-Step | MLE |
|---|---|---|
| Speed | Fast | Slow |
| Efficiency | Good | Best (asymptotically) |
| Robustness | More robust to misspecification | Sensitive |
| Convergence | Always works | May fail |
| Sample size | Works with N < 100 | Better with N > 200 |
Recommendation: Start with two-step. Use MLE for final results only if N > 200 and two-step shows significant selection.
How do I test for selection bias?
from panelbox.models.selection import test_selection_effect
# heckman is assumed to be an already-constructed PanelHeckman model
result = heckman.fit(method="two_step")
# Test H0: no selection bias (rho = 0)
sel_test = test_selection_effect(result)
print(f"rho = {result.rho:.3f}, p-value = {sel_test.pvalue:.4f}")
# If p < 0.05: selection bias exists, Heckman correction is needed
# If p >= 0.05: no evidence of selection bias, OLS may be sufficient
What does rho > 1 mean in the Heckman model?
Because \(\rho\) is a correlation, it must lie in \([-1, 1]\); an estimate with \(|\rho| > 1\) signals that the model is misspecified. Possible causes:
- Wrong exclusion restriction: the instrument affects the outcome directly
- Functional form misspecification: add non-linear terms (e.g., exper^2)
- Omitted variables in the selection equation
Solution: Use method="two_step" (more robust) and reconsider your exclusion restriction.
Cointegration & Unit Root Tests
Which cointegration test should I use: Westerlund, Pedroni, or Kao?
| Feature | Westerlund | Pedroni | Kao |
|---|---|---|---|
| Heterogeneity | Allows | Allows | Assumes homogeneous |
| Power (small T) | High | Medium | Medium |
| Interpretation | ECM-based | Residual-based | Residual-based |
| Cross-section dependence | Robust | Sensitive | Sensitive |
| Speed | Slow | Fast | Fast |
Recommendation: Use Westerlund for most cases. Use Pedroni if Westerlund is too slow. Use Kao only if you believe in homogeneous cointegrating vectors.
from panelbox.diagnostics.cointegration import westerlund_test, pedroni_test, kao_test
# Westerlund (recommended)
west = westerlund_test(data, dependent="y", covariates=["x1", "x2"])
print(west.summary())
# Pedroni (faster)
ped = pedroni_test(data, dependent="y", covariates=["x1", "x2"])
# Kao (homogeneous)
kao = kao_test(data, dependent="y", covariates=["x1", "x2"])
All my unit root tests disagree. How do I interpret this?
Different tests have different null hypotheses and power properties:
| Test | H0 | Good for |
|---|---|---|
| Hadri | Stationarity | Confirming unit roots |
| IPS | Unit root | Detecting stationarity |
| LLC | Unit root (homogeneous) | Homogeneous panels |
| Breitung | Unit root | Robust to heterogeneous trends |
| Fisher | Unit root | Unbalanced panels |
Interpretation guide:
- Hadri rejects + IPS fails to reject → Unit root present
- Hadri fails to reject + IPS rejects → Stationary
- Both reject → Borderline (near unit root); use Breitung as tiebreaker
- Both fail to reject → Low power; increase sample or check for structural breaks
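The interpretation guide can be written down as a small decision function. This is a sketch of the four cases above only: `interpret_unit_root` is a hypothetical helper, and the p-values passed in are illustrative.

```python
def interpret_unit_root(hadri_p, ips_p, alpha=0.05):
    """Combine a Hadri test (H0: stationarity) with an IPS test (H0: unit root)."""
    hadri_rejects = hadri_p < alpha  # evidence against stationarity
    ips_rejects = ips_p < alpha      # evidence against a unit root
    if hadri_rejects and not ips_rejects:
        return "unit root"
    if ips_rejects and not hadri_rejects:
        return "stationary"
    if hadri_rejects and ips_rejects:
        return "borderline: near unit root, use Breitung as tiebreaker"
    return "inconclusive: low power, check sample size and structural breaks"


print(interpret_unit_root(hadri_p=0.001, ips_p=0.40))
print(interpret_unit_root(hadri_p=0.30, ips_p=0.002))
print(interpret_unit_root(hadri_p=0.01, ips_p=0.01))
print(interpret_unit_root(hadri_p=0.50, ips_p=0.50))
```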
Best practice: Run a battery of tests and look for consensus.
Which Pedroni statistics should I report?
Pedroni offers 7 test statistics, divided into:
- Panel statistics (within-dimension): assume common AR dynamics — more powerful under homogeneity
- Group statistics (between-dimension): allow heterogeneous AR dynamics — more robust
Recommendation: Report the group ADF statistic (most commonly used) and the panel ADF statistic for robustness. If they agree, the result is reliable.
How do I set the trend specification for unit root tests?
| Specification | Code | Use when |
|---|---|---|
| No constant, no trend | "nc" | Rarely used |
| Constant only | "c" | Default (most common) |
| Constant + trend | "ct" | Series has deterministic trend |
Rule: If the series visually trends upward/downward, use "ct". Otherwise use "c".
PPML / Count Models
When does PPML fail?
PPML can fail due to:
- Separation: a regressor perfectly predicts zero outcomes
- Perfect prediction: fitted values are exactly zero for some observations
- Extreme outliers in the dependent variable
Troubleshooting:
from panelbox.models.count import PPML, PoissonQML
ppml = PPML(data, dep_var="trade", exog_vars=[...])
try:
    result = ppml.fit(maxiter=500)
except Exception:
    # Fall back to Poisson QML (more robust)
    qml = PoissonQML(data, dep_var="trade", exog_vars=[...])
    result = qml.fit()
Also try scaling variables (divide large values by 1e6) and removing extreme outliers.
How do I test for overdispersion (Poisson vs Negative Binomial)?
If the variance exceeds the mean, the data is overdispersed and Negative Binomial may be more appropriate:
# Quick check
mean_y = data["y"].mean()
var_y = data["y"].var()
print(f"Variance/Mean ratio: {var_y / mean_y:.2f}")
# If > 1.5: overdispersed, consider Negative Binomial
For a formal test, estimate both models and compare their fit.
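One common way to make that comparison is a likelihood-ratio test on the two fitted log-likelihoods. The sketch below is not PanelBox API: `lr_overdispersion_test` is a hypothetical helper that takes the log-likelihoods as plain numbers, however they were obtained, and the values used are made up. Because the Poisson model sits on the boundary of the Negative Binomial parameter space (dispersion equal to zero), the usual chi-square(1) p-value is halved.

```python
import math


def lr_overdispersion_test(loglik_poisson, loglik_negbin):
    """LR test of H0: no overdispersion (Poisson) against Negative Binomial.

    Returns (LR statistic, p-value).
    """
    lr = 2.0 * (loglik_negbin - loglik_poisson)
    if lr <= 0:
        return 0.0, 1.0
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_chi2 = math.erfc(math.sqrt(lr / 2.0))
    # Halve the p-value because the dispersion parameter is on the
    # boundary of its parameter space under H0.
    return lr, 0.5 * p_chi2


# Illustrative log-likelihoods only.
lr, pvalue = lr_overdispersion_test(loglik_poisson=-1250.4, loglik_negbin=-1238.9)
print(f"LR = {lr:.2f}, p-value = {pvalue:.6f}")
```

A small p-value favors the Negative Binomial model. The test is only meaningful when both models use the same data and regressors.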
Zero-inflated vs standard count models — when to use which?
| Model | Use when |
|---|---|
| Standard Poisson/NB | Zeros arise from the same process as positive counts |
| Zero-Inflated | Zeros come from two processes: "structural zeros" (never events) + sampling zeros |
Example: patent counts — some firms never innovate (structural zeros) while others innovate but happen to have zero patents in a given year (sampling zeros).
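The two-process idea corresponds to a simple mixture. In a zero-inflated Poisson model, an observation is a structural zero with probability \(\pi\) and otherwise follows a Poisson distribution with mean \(\lambda\). The sketch below uses illustrative function names and parameter values to show how much extra mass this puts at zero.

```python
import math


def poisson_pmf(k, lam):
    """P(Y = k) for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson: structural zero with probability pi,
    otherwise a Poisson(lam) draw (which may also be zero)."""
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


lam, pi = 2.0, 0.3
print(f"P(y=0) under Poisson: {poisson_pmf(0, lam):.3f}")
print(f"P(y=0) under ZIP:     {zip_pmf(0, lam, pi):.3f}")
```

If the observed share of zeros is far above what a fitted Poisson or Negative Binomial model predicts, that is a sign the zero-inflated form is worth estimating.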
Multinomial / Discrete Choice
IIA assumption — how do I test it?
The Independence of Irrelevant Alternatives (IIA) assumption in multinomial logit can be tested using:
- Hausman-McFadden test: estimate model with and without one alternative
- Small-Hsiao test: split sample and compare estimates
If IIA is rejected, consider:
- Nested logit (groups correlated alternatives)
- Mixed logit (random coefficients)
- Conditional logit with alternative-specific variables
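For intuition about what the tests check, IIA follows directly from the multinomial logit formula: the odds between any two alternatives depend only on their own utilities, so removing a third alternative leaves those odds unchanged. A minimal sketch with made-up utilities (the alternatives and values are purely illustrative):

```python
import math


def mnl_probs(utilities):
    """Multinomial logit choice probabilities from a dict of utilities."""
    m = max(utilities.values())  # subtract the max for numerical stability
    expu = {alt: math.exp(u - m) for alt, u in utilities.items()}
    total = sum(expu.values())
    return {alt: e / total for alt, e in expu.items()}


full = mnl_probs({"car": 1.0, "bus": 0.5, "train": 0.8})
reduced = mnl_probs({"car": 1.0, "bus": 0.5})  # train removed

ratio_full = full["car"] / full["bus"]
ratio_reduced = reduced["car"] / reduced["bus"]
print(f"car/bus odds with train:    {ratio_full:.4f}")
print(f"car/bus odds without train: {ratio_reduced:.4f}")
```

Both ratios equal exp(1.0 - 0.5). If real choices violate this pattern, for example because train riders would mostly switch to bus, estimates from the full and restricted choice sets diverge, which is what the Hausman-McFadden test looks for.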
Fixed Effects logit drops groups — why?
FE logit uses conditional maximum likelihood, which requires within-group variation in the dependent variable. Groups where the outcome never changes (always 0 or always 1) provide no information and are dropped.
This is expected behavior, not an error. If many groups are dropped, consider:
- Using Random Effects logit instead
- Checking if your sample has sufficient variation
- Using a longer time span
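Before estimating, it can help to count how many groups will be dropped. The sketch below works on plain (entity, outcome) pairs so that it stays self-contained; `groups_without_variation` is a hypothetical helper and the data are made up. With a pandas DataFrame the same check is a group-by on the entity column.

```python
from collections import defaultdict


def groups_without_variation(rows):
    """Return entities whose binary outcome never changes.

    rows: iterable of (entity, y) pairs with y in {0, 1}.
    """
    outcomes = defaultdict(set)
    for entity, y in rows:
        outcomes[entity].add(y)
    # A group contributes to FE logit only if it has both a 0 and a 1.
    return sorted(e for e, ys in outcomes.items() if len(ys) < 2)


rows = [
    ("A", 0), ("A", 1), ("A", 0),
    ("B", 1), ("B", 1),
    ("C", 0), ("C", 0),
    ("D", 0), ("D", 1),
]
dropped = groups_without_variation(rows)
print(dropped)  # ['B', 'C']
```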
Performance Tips
My estimation is slow. How can I speed it up?
General tips:
| Bottleneck | Solution |
|---|---|
| Too many GMM instruments | Use collapse=True, limit max_lags |
| CUE-GMM | Use two-step for exploration, CUE for final results only |
| Bootstrap | Reduce n_boot (499 is often sufficient), or use asymptotic |
| Large N spatial model | Use sparse weight matrices |
| Heckman MLE | Reduce quadrature_points to 10 (from default 15) |
| FE multinomial logit | Use RE if J > 4 or T > 10 |
Workflow recommendation: Use fast methods (two-step, asymptotic, OLS) for exploration. Switch to robust methods (CUE, bootstrap, GMM) for final results.
Memory issues with large panels. What can I do?
- Reduce bootstrap replications: n_boot=499 instead of 1999
- Use sparse weight matrices for spatial models: W.to_sparse()
- Process by variable instead of all at once (e.g., IRFs one impulse at a time)
- Subset your data for initial exploration, then run on full data for final results
Heckman MLE is very slow. Are there alternatives?
- Reduce quadrature points (biggest speedup).
- Use two-step estimates as starting values.
- Just use two-step; the efficiency loss is small for most applications.
Approximate timings (N=1000, T=5): two-step ~5s, MLE (10 pts) ~30s, MLE (15 pts) ~60s.
See Also
- General FAQ — getting started, model selection, results interpretation
- Spatial FAQ — spatial econometrics questions
- Troubleshooting — error messages and debugging
- GMM API Reference
- VAR API Reference
- Cointegration Theory