General Nesting Spatial Model (GNS)
Quick Reference
Class: panelbox.models.spatial.GeneralNestingSpatial
Import: from panelbox.models.spatial import GeneralNestingSpatial
Stata equivalent: No direct equivalent
R equivalent: spatialreg::sacsarlm() (partial)
Overview
The General Nesting Spatial (GNS) model is the encompassing spatial specification that nests all other static spatial models as special cases. It simultaneously includes a spatial lag of the dependent variable (\(\rho W_1 y\)), spatially lagged covariates (\(W_2 X\theta\)), and spatially autocorrelated errors (\(\lambda W_3 u\)).
The model is specified as:
\[
y = \rho W_1 y + X\beta + W_2 X\theta + u, \qquad u = \lambda W_3 u + \varepsilon
\]
where:
- \(W_1\), \(W_2\), \(W_3\) are spatial weight matrices (can be different or the same)
- \(\rho\) is the spatial lag parameter
- \(\theta\) is the vector of spatially lagged covariate effects
- \(\lambda\) is the spatial error parameter
- \(\varepsilon \sim \text{iid}(0, \sigma^2)\)
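To make the roles of the three spatial terms concrete, the following sketch draws one cross-section from a GNS data-generating process using plain NumPy. It assumes the standard structural form, with \(y\) depending on \(\rho W_1 y\), \(X\beta\), \(W_2 X\theta\), and an error \(u\) that follows \(u = \lambda W_3 u + \varepsilon\). The helper names `ring_weights` and `simulate_gns` and all parameter values are illustrative and not part of panelbox.

```python
import numpy as np

def ring_weights(n):
    """Row-standardized weights on a ring: each unit's neighbors are the units on either side."""
    W = np.zeros((n, n))
    idx = np.arange(n)
    W[idx, (idx - 1) % n] = 0.5
    W[idx, (idx + 1) % n] = 0.5
    return W

def simulate_gns(X, beta, theta, rho, lam, W1, W2, W3, sigma=1.0, seed=0):
    """Draw y and u from the GNS structural equations (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    I = np.eye(n)
    eps = rng.normal(scale=sigma, size=n)
    # Spatially autocorrelated error: u = (I - lam * W3)^{-1} eps
    u = np.linalg.solve(I - lam * W3, eps)
    # Reduced form: y = (I - rho * W1)^{-1} (X beta + W2 X theta + u)
    y = np.linalg.solve(I - rho * W1, X @ beta + W2 @ X @ theta + u)
    return y, u

n = 50
W = ring_weights(n)
X = np.random.default_rng(1).normal(size=(n, 2))
beta, theta = np.array([1.0, -0.5]), np.array([0.3, 0.1])
y, u = simulate_gns(X, beta, theta, rho=0.4, lam=0.3, W1=W, W2=W, W3=W)
```

Solving the linear systems rather than inverting \(I - \rho W_1\) explicitly keeps the sketch numerically stable. Setting `rho`, `lam`, or `theta` to zero in the call reproduces the simpler nested processes.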
The primary use of the GNS is model selection: by estimating the full model and testing parameter restrictions, you can determine which simpler spatial specification is supported by the data.
Nested Models
| Restriction | Resulting Model | Description |
|---|---|---|
| \(\lambda = 0, \theta = 0\) | SAR | Spatial lag only |
| \(\rho = 0, \theta = 0\) | SEM | Spatial error only |
| \(\lambda = 0\) | SDM | Spatial Durbin |
| \(\rho = 0, \lambda = 0\) | SLX | Spatial lag of X only |
| \(\theta = 0\) | SAC/SARAR | Spatial lag + error |
| \(\rho = 0\) | SDEM | Spatial Durbin error |
| \(\rho = 0, \theta = 0, \lambda = 0\) | OLS | No spatial dependence |
| None | GNS | Full model |
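The table can be restated as a small lookup from the set of parameters fixed at zero to the resulting model, together with the number of restrictions this imposes. This is a sketch only: the names `NESTED_MODELS`, `nested_model`, and `restriction_df` are hypothetical, and the degrees-of-freedom count assumes \(\theta\) has one coefficient per covariate.

```python
# Map the set of parameters restricted to zero onto the nested model it implies.
NESTED_MODELS = {
    frozenset(): "GNS",
    frozenset({"lambda", "theta"}): "SAR",
    frozenset({"rho", "theta"}): "SEM",
    frozenset({"lambda"}): "SDM",
    frozenset({"rho", "lambda"}): "SLX",
    frozenset({"theta"}): "SAC",
    frozenset({"rho"}): "SDEM",
    frozenset({"rho", "theta", "lambda"}): "OLS",
}

def nested_model(zeroed):
    """Return the model name implied by fixing the given parameters at zero."""
    return NESTED_MODELS[frozenset(zeroed)]

def restriction_df(zeroed, n_covariates):
    """Count restrictions: rho and lambda are scalars, theta has one entry per covariate (assumed)."""
    return sum(n_covariates if name == "theta" else 1 for name in set(zeroed))

print(nested_model({"theta"}))                  # SAC
print(restriction_df({"theta", "lambda"}, 2))   # 3
```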
Quick Example
import numpy as np
from panelbox.models.spatial import GeneralNestingSpatial, SpatialWeights
# Assumes `data` is a balanced panel DataFrame with columns y, x1, x2, region, year,
# and `gdf` is a GeoDataFrame with one geometry per region
# Create weight matrix (use same W for all three)
W = SpatialWeights.from_contiguity(gdf, criterion='queen')
# Fit GNS model
model = GeneralNestingSpatial(
    "y ~ x1 + x2", data, "region", "year",
    W1=W.matrix, W2=W.matrix, W3=W.matrix
)
results = model.fit(effects='fixed', method='ml')
# View all spatial parameters
print(results.summary())
# Test restrictions to select the best nested model
# Test: is SAR adequate? (theta = 0, lambda = 0)
test_sar = model.test_restrictions(restrictions={'theta': 0, 'lambda': 0})
print(f"GNS vs SAR: LR = {test_sar['lr_statistic']:.2f}, p = {test_sar['p_value']:.4f}")
# Test: is SEM adequate? (rho = 0, theta = 0)
test_sem = model.test_restrictions(restrictions={'rho': 0, 'theta': 0})
print(f"GNS vs SEM: LR = {test_sem['lr_statistic']:.2f}, p = {test_sem['p_value']:.4f}")
# Automatic model identification
model_type = model.identify_model_type(results)
print(f"Identified model type: {model_type}")  # e.g., 'SDM', 'SAR', etc.
When to Use
- Model selection: determine which spatial specification is supported by the data
- Robustness check: verify that your chosen model is not rejected by the data
- Exploratory analysis: when you have no strong prior about the form of spatial dependence
- Publication: report the GNS alongside your preferred model to show robustness
- Different weight matrices: when the spatial structure differs for lag, Durbin, and error terms
Key Assumptions
- Large \(N\) recommended: the GNS has many parameters and requires sufficient cross-sectional variation for identification
- Balanced panel: required for spatial structure
- Stationarity: \(|\rho| < 1\) and \(|\lambda| < 1\)
- Identification concern: with a single \(W\) for all three terms, the model may be weakly identified (Gibbons & Overman, 2012)
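The stationarity condition can be checked directly from the eigenvalues of the weight matrix. The sketch below row-standardizes a rook-contiguity matrix on a small grid and computes the interval \((1/\omega_{\min}, 1/\omega_{\max})\) on which \(I - \rho W\) is nonsingular. It assumes the eigenvalues of \(W\) are real, which holds for a row-standardized symmetric contiguity matrix. The helper names are illustrative and not part of panelbox.

```python
import numpy as np

def rook_grid(nrow, ncol):
    """Binary rook-contiguity matrix for cells on an nrow x ncol grid."""
    n = nrow * ncol
    A = np.zeros((n, n))
    for r in range(nrow):
        for c in range(ncol):
            i = r * ncol + c
            if c + 1 < ncol:              # neighbor to the right
                A[i, i + 1] = A[i + 1, i] = 1.0
            if r + 1 < nrow:              # neighbor below
                A[i, i + ncol] = A[i + ncol, i] = 1.0
    return A

def row_standardize(A):
    """Scale each row to sum to one; rows without neighbors stay zero."""
    row_sums = A.sum(axis=1, keepdims=True)
    return np.divide(A, row_sums, out=np.zeros_like(A), where=row_sums > 0)

def admissible_range(W):
    """Interval (1/omega_min, 1/omega_max), assuming W has real eigenvalues."""
    eig = np.linalg.eigvals(W).real
    return 1.0 / eig.min(), 1.0 / eig.max()

W = row_standardize(rook_grid(4, 4))
lower, upper = admissible_range(W)

def is_admissible(coef):
    """True when a spatial coefficient lies strictly inside the admissible interval."""
    return lower < coef < upper
```

For a row-standardized matrix the largest eigenvalue is one, so the upper bound is one and \(|\rho| < 1\) is a sufficient condition. The same check applies to \(\lambda\) with \(W_3\).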
Detailed Guide
Using Different Weight Matrices
A key feature of the GNS is the ability to use different weight matrices for each spatial component. This is theoretically motivated when the mechanisms of spatial interaction differ:
# Example: Regional economics
# Assumes `gdf`, `coords`, `trade_weight_matrix`, and `data` are already defined
from panelbox.models.spatial import GeneralNestingSpatial, SpatialWeights
# W1: contiguity (outcome spillovers via shared borders)
W1 = SpatialWeights.from_contiguity(gdf, criterion='queen').matrix
# W2: trade flows (covariate spillovers via economic linkages)
W2 = trade_weight_matrix  # Custom N x N matrix
# W3: distance (error correlation via proximity)
W3 = SpatialWeights.from_distance(coords, threshold=500).matrix
model = GeneralNestingSpatial(
    "y ~ x1 + x2", data, "region", "year",
    W1=W1, W2=W2, W3=W3
)
results = model.fit(effects='fixed', method='ml')
Tip
Using different weight matrices improves identification. When \(W_1 = W_2 = W_3\), the GNS can suffer from weak identification; distinct matrices provide the variation needed to separately identify \(\rho\), \(\theta\), and \(\lambda\).
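One rough way to gauge how much independent variation two weight matrices carry is to compare the spatial lags they produce for the same variable. The sketch below does this for a k-nearest-neighbor matrix and an inverse-distance matrix built from random coordinates. It is a heuristic illustration rather than a formal identification test, and every name in it (`row_standardize`, `lag_correlation`, the simulated `coords`) is hypothetical.

```python
import numpy as np

def row_standardize(A):
    """Scale each row to sum to one; rows without neighbors stay zero."""
    row_sums = A.sum(axis=1, keepdims=True)
    return np.divide(A, row_sums, out=np.zeros_like(A), where=row_sums > 0)

def lag_correlation(Wa, Wb, x):
    """Correlation between the spatial lags Wa @ x and Wb @ x."""
    return float(np.corrcoef(Wa @ x, Wb @ x)[0, 1])

rng = np.random.default_rng(0)
n = 100
coords = rng.uniform(0, 10, size=(n, 2))
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)

# Matrix A: four nearest neighbors (column 0 of the sort is the unit itself)
nearest = np.argsort(dist, axis=1)[:, 1:5]
A_knn = np.zeros((n, n))
A_knn[np.arange(n)[:, None], nearest] = 1.0
W_knn = row_standardize(A_knn)

# Matrix B: inverse distance between all pairs
A_inv = np.zeros((n, n))
off_diag = dist > 0
A_inv[off_diag] = 1.0 / dist[off_diag]
W_inv = row_standardize(A_inv)

x = rng.normal(size=n)
same = lag_correlation(W_knn, W_knn, x)       # identical matrices: perfectly collinear lags
different = lag_correlation(W_knn, W_inv, x)  # distinct matrices: lags are less than perfectly correlated
print(f"same W: {same:.3f}, different W: {different:.3f}")
```

A correlation close to one suggests the two matrices encode nearly the same neighborhood structure, in which case supplying them as separate `W1`, `W2`, or `W3` arguments is unlikely to help much.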
Estimation
The GNS is estimated by maximum likelihood with numerical optimization over \(\rho\) and \(\lambda\):
results = model.fit(
    effects='fixed',          # 'fixed', 'random', or 'pooled'
    method='ml',              # Only ML is available for GNS
    rho_init=0.0,             # Starting value for rho
    lambda_init=0.0,          # Starting value for lambda
    include_wx=True,          # Include W2*X terms
    maxiter=1000,             # Maximum iterations
    optim_method='L-BFGS-B'   # Optimization algorithm
)
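The log-likelihood involves two spatial determinant terms. In its standard form for a balanced panel of \(N\) units and \(T\) periods (before any effects transformation) it reads:
\[
\ln L = -\frac{NT}{2}\ln(2\pi\sigma^2) + T\ln|I - \rho W_1| + T\ln|I - \lambda W_3| - \frac{1}{2\sigma^2}\,\varepsilon'\varepsilon
\]
where \(\varepsilon\) stacks the residuals \((I - \lambda W_3)\left[(I - \rho W_1)y_t - X_t\beta - W_2 X_t\theta\right]\) over the \(T\) periods.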
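As a minimal sketch of what numerical optimization over \(\rho\) and \(\lambda\) involves, the code below writes a pooled (no fixed effects) Gaussian log-likelihood with the slope coefficients and \(\sigma^2\) concentrated out, then maximizes it with SciPy's L-BFGS-B on simulated data. This is an illustration under stated assumptions, not panelbox's internal implementation. The names `ring_weights` and `neg_loglik`, the bounds of ±0.95, and the simulated data are all hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def ring_weights(n, k=1):
    """Row-standardized ring weights: neighbors are the k units on each side."""
    W = np.zeros((n, n))
    idx = np.arange(n)
    for step in range(1, k + 1):
        W[idx, (idx - step) % n] = 1.0
        W[idx, (idx + step) % n] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def neg_loglik(params, y, Z, W1, W3):
    """Negative concentrated log-likelihood; y is (T, N), Z is (T, N, k) holding [X, W2 X]."""
    rho, lam = params
    T, N = y.shape
    A = np.eye(N) - rho * W1
    B = np.eye(N) - lam * W3
    # Filter both sides: eps_t = B (A y_t - Z_t gamma)
    y_f = np.einsum("ij,tj->ti", B @ A, y).reshape(-1)
    Z_f = np.einsum("ij,tjk->tik", B, Z).reshape(T * N, -1)
    gamma, *_ = np.linalg.lstsq(Z_f, y_f, rcond=None)   # OLS on filtered data
    resid = y_f - Z_f @ gamma
    sigma2 = resid @ resid / (T * N)
    logdet_A = np.linalg.slogdet(A)[1]
    logdet_B = np.linalg.slogdet(B)[1]
    llf = -0.5 * T * N * (np.log(2 * np.pi * sigma2) + 1.0) + T * (logdet_A + logdet_B)
    return -llf

# Simulated pooled panel with distinct W1 and W3 (illustrative values only)
rng = np.random.default_rng(42)
N, T = 40, 10
W1, W3 = ring_weights(N, k=1), ring_weights(N, k=2)
X = rng.normal(size=(T, N, 2))
Z = np.concatenate([X, np.einsum("ij,tjk->tik", W1, X)], axis=2)   # W2 = W1 here
gamma_true = np.array([1.0, -0.5, 0.5, 0.2])
eps = rng.normal(size=(T, N))
u = np.linalg.solve(np.eye(N) - 0.3 * W3, eps.T).T
y = np.linalg.solve(np.eye(N) - 0.4 * W1, (Z @ gamma_true + u).T).T

res = minimize(neg_loglik, x0=np.zeros(2), args=(y, Z, W1, W3),
               method="L-BFGS-B", bounds=[(-0.95, 0.95)] * 2)
rho_hat, lam_hat = res.x
```

Concentrating out the linear parameters leaves only two free parameters for the optimizer, which is why starting values are needed only for \(\rho\) and \(\lambda\). The box bounds keep both inside the stationary region for row-standardized weights.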
Testing Restrictions
The test_restrictions() method performs Likelihood Ratio (LR) tests for nested models:
# LR test: H0: restricted model is adequate
# Under H0: LR = 2 * (llf_unrestricted - llf_restricted) ~ chi2(df), df = number of restrictions
# Test theta = 0 (GNS → SAC)
test_sac = model.test_restrictions(restrictions={'theta': 0})
# Test lambda = 0 (GNS → SDM)
test_sdm = model.test_restrictions(restrictions={'lambda': 0})
# Test rho = 0 (GNS → SDEM)
test_sdem = model.test_restrictions(restrictions={'rho': 0})
# Test all spatial parameters = 0 (GNS → OLS)
test_ols = model.test_restrictions(restrictions={'rho': 0, 'theta': 0, 'lambda': 0})
# Print results
for name, test in [('SAC', test_sac), ('SDM', test_sdm),
                   ('SDEM', test_sdem), ('OLS', test_ols)]:
    print(f"GNS vs {name}: LR={test['lr_statistic']:.2f}, "
          f"p={test['p_value']:.4f}, {test['conclusion']}")
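For readers who want to reproduce the statistic by hand, the arithmetic behind an LR test can be sketched with `scipy.stats.chi2`. The function `lr_test` is a hypothetical helper, and the two log-likelihood values are made up for illustration. The dictionary keys mirror those used above but are not produced by panelbox here.

```python
from scipy.stats import chi2

def lr_test(llf_unrestricted, llf_restricted, df, alpha=0.05):
    """Likelihood ratio test of a restricted model against the unrestricted one."""
    lr = 2.0 * (llf_unrestricted - llf_restricted)
    lr = max(lr, 0.0)                      # guard against tiny negatives from optimizer noise
    p_value = float(chi2.sf(lr, df))       # upper-tail probability of chi2(df)
    return {"lr_statistic": lr, "p_value": p_value, "reject": bool(p_value < alpha)}

# Hypothetical log-likelihoods: full GNS vs. a model with two restrictions
example = lr_test(llf_unrestricted=-512.3, llf_restricted=-515.9, df=2)
print(f"LR = {example['lr_statistic']:.2f}, p = {example['p_value']:.4f}")
```

With these illustrative numbers the statistic is 7.2 on 2 degrees of freedom, giving a p-value of about 0.027, so the restricted model would be rejected at the 5% level.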
Automatic Model Identification
The identify_model_type() method classifies the estimated GNS based on which parameters are statistically significant:
model_type = model.identify_model_type(results)
print(f"Data supports: {model_type}")
# Returns one of: 'SAR', 'SEM', 'SDM', 'SAC', 'SDEM', 'SDEM-SEM', 'GNS', 'OLS'
The identification uses significance of \(\rho\), \(\theta\), and \(\lambda\) at the 5% level:
| \(\rho\) sig. | \(\theta\) sig. | \(\lambda\) sig. | Model |
|---|---|---|---|
| Yes | No | No | SAR |
| No | No | Yes | SEM |
| Yes | Yes | No | SDM |
| Yes | No | Yes | SAC |
| No | Yes | No | SLX |
| No | Yes | Yes | SDEM |
| Yes | Yes | Yes | GNS |
| No | No | No | OLS |
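The decision rule in the table can be written as a lookup on three significance flags. The sketch below is an illustration of the table, not the library's code. The function name `classify_from_pvalues` is hypothetical, and it assumes a single p-value for \(\theta\) (for example from a joint test on all lagged covariates).

```python
# Keys are (rho significant, theta significant, lambda significant)
DECISION_TABLE = {
    (True, False, False): "SAR",
    (False, False, True): "SEM",
    (True, True, False): "SDM",
    (True, False, True): "SAC",
    (False, True, False): "SLX",
    (False, True, True): "SDEM",
    (True, True, True): "GNS",
    (False, False, False): "OLS",
}

def classify_from_pvalues(p_rho, p_theta, p_lambda, alpha=0.05):
    """Map three p-values onto a model label using the significance table."""
    flags = (p_rho < alpha, p_theta < alpha, p_lambda < alpha)
    return DECISION_TABLE[flags]

print(classify_from_pvalues(0.01, 0.20, 0.50))   # SAR
print(classify_from_pvalues(0.50, 0.01, 0.01))   # SDEM
```

Because the rule depends on individual significance tests, it can disagree with the LR tests when parameters are weakly identified, so it is best read alongside them rather than in place of them.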
Configuration Options
Constructor Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| formula | str | Required | Wilkinson formula |
| data | pd.DataFrame | Required | Panel dataset |
| entity_col | str | Required | Entity identifier |
| time_col | str | Required | Time identifier |
| W1 | np.ndarray | None | Weight matrix for spatial lag (\(Wy\)) |
| W2 | np.ndarray | None | Weight matrix for spatial Durbin (\(WX\)) |
| W3 | np.ndarray | None | Weight matrix for spatial error (\(Wu\)) |
| weights | np.ndarray | None | Observation weights |
fit() Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| effects | str | 'fixed' | 'fixed', 'random', or 'pooled' |
| method | str | 'ml' | 'ml' (only available method) |
| rho_init | float | 0.0 | Initial value for \(\rho\) |
| lambda_init | float | 0.0 | Initial value for \(\lambda\) |
| include_wx | bool | True | Include \(W_2 X\) terms |
| maxiter | int | 1000 | Maximum iterations |
| optim_method | str | 'L-BFGS-B' | Scipy optimizer |
Caution
The GNS is a computationally expensive model. For large \(N\), estimation may take several minutes due to the need to compute two log-determinant terms (\(|I - \rho W_1|\) and \(|I - \lambda W_3|\)) at each iteration. Consider starting with simpler models (SAR, SEM, SDM) and using the GNS only for formal model selection.
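A common way to reduce the per-iteration cost of the log-determinants is to compute the eigenvalues \(\omega_i\) of each weight matrix once and then use \(\ln|I - \rho W| = \sum_i \ln(1 - \rho\,\omega_i)\). The sketch below shows the idea and checks it against a direct determinant. Whether panelbox uses this shortcut internally is not stated in this page, and the class name `EigenLogDet` is hypothetical.

```python
import numpy as np

class EigenLogDet:
    """Evaluate ln|I - c W| for many values of c after one eigendecomposition."""

    def __init__(self, W):
        # One-off O(N^3) cost; eigenvalues may be complex for asymmetric W
        self.eigenvalues = np.linalg.eigvals(W)

    def __call__(self, coef):
        # O(N) per call; the real part of the summed logs equals ln|det(I - c W)|
        return float(np.sum(np.log(1.0 - coef * self.eigenvalues)).real)

# Random row-standardized weights with a zero diagonal (illustrative data)
rng = np.random.default_rng(0)
n = 30
A = rng.uniform(size=(n, n))
np.fill_diagonal(A, 0.0)
W = A / A.sum(axis=1, keepdims=True)

logdet = EigenLogDet(W)
fast = logdet(0.5)
direct = np.linalg.slogdet(np.eye(n) - 0.5 * W)[1]
print(f"eigenvalue route: {fast:.6f}, direct route: {direct:.6f}")
```

With two weight matrices, one `EigenLogDet` instance per matrix would be created before optimization starts, so each likelihood evaluation needs only two cheap sums instead of two full determinants.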
Tutorials
| Tutorial | Description |
|---|---|
| Spatial Econometrics | Includes GNS model selection |
See Also
- Spatial Lag (SAR) — Nested model with \(\theta = 0, \lambda = 0\)
- Spatial Error (SEM) — Nested model with \(\rho = 0, \theta = 0\)
- Spatial Durbin (SDM) — Nested model with \(\lambda = 0\)
- Choosing a Spatial Model — Using GNS for model selection
- Spatial Diagnostics — LR tests and model comparison
References
- Elhorst, J.P. (2014). Spatial Econometrics: From Cross-Sectional Data to Spatial Panels. Springer.
- LeSage, J. and Pace, R.K. (2009). Introduction to Spatial Econometrics. Chapman & Hall/CRC.
- Manski, C.F. (1993). Identification of endogenous social effects: The reflection problem. Review of Economic Studies, 60(3), 531-542.
- Gibbons, S. and Overman, H.G. (2012). Mostly pointless spatial econometrics? Journal of Regional Science, 52(2), 172-191.