General Nesting Spatial Model (GNS)¶

Quick Reference

Class: panelbox.models.spatial.GeneralNestingSpatial Import: from panelbox.models.spatial import GeneralNestingSpatial Stata equivalent: No direct equivalent R equivalent: spatialreg::sacsarlm() (partial)

Overview¶

The General Nesting Spatial (GNS) model is the encompassing spatial specification that nests all other static spatial models as special cases. It simultaneously includes a spatial lag of the dependent variable (\(\rho W_1 y\)), spatially lagged covariates (\(W_2 X\theta\)), and spatially autocorrelated errors (\(\lambda W_3 u\)).

The model is specified as:

\[y = \rho W_1 y + X\beta + W_2 X\theta + u, \quad u = \lambda W_3 u + \varepsilon\]

where:

\(W_1\), \(W_2\), \(W_3\) are spatial weight matrices (can be different or the same)
\(\rho\) is the spatial lag parameter
\(\theta\) is the vector of spatially lagged covariate effects
\(\lambda\) is the spatial error parameter
\(\varepsilon \sim \text{iid}(0, \sigma^2)\)

The primary use of the GNS is model selection: by estimating the full model and testing parameter restrictions, you can determine which simpler spatial specification is supported by the data.

Nested Models¶

Restriction	Resulting Model	Description
\(\lambda = 0, \theta = 0\)	SAR	Spatial lag only
\(\rho = 0, \theta = 0\)	SEM	Spatial error only
\(\lambda = 0\)	SDM	Spatial Durbin
\(\rho = 0, \lambda = 0\)	SLX	Spatial lag of X only
\(\theta = 0\)	SAC/SARAR	Spatial lag + error
\(\rho = 0\)	SDEM	Spatial Durbin error
\(\rho = 0, \theta = 0, \lambda = 0\)	OLS	No spatial dependence
None	GNS	Full model

Quick Example¶

import numpy as np
from panelbox.models.spatial import GeneralNestingSpatial, SpatialWeights

# Create weight matrix (use same W for all three)
W = SpatialWeights.from_contiguity(gdf, criterion='queen')

# Fit GNS model
model = GeneralNestingSpatial(
    "y ~ x1 + x2", data, "region", "year",
    W1=W.matrix, W2=W.matrix, W3=W.matrix
)
results = model.fit(effects='fixed', method='ml')

# View all spatial parameters
print(results.summary())

# Test restrictions to select the best nested model
# Test: is SAR adequate? (theta = 0, lambda = 0)
test_sar = model.test_restrictions(restrictions={'theta': 0, 'lambda': 0})
print(f"GNS vs SAR: LR = {test_sar['lr_statistic']:.2f}, p = {test_sar['p_value']:.4f}")

# Test: is SEM adequate? (rho = 0, theta = 0)
test_sem = model.test_restrictions(restrictions={'rho': 0, 'theta': 0})
print(f"GNS vs SEM: LR = {test_sem['lr_statistic']:.2f}, p = {test_sem['p_value']:.4f}")

# Automatic model identification
model_type = model.identify_model_type(results)
print(f"Identified model type: {model_type}")  # e.g., 'SDM', 'SAR', etc.

When to Use¶

Model selection: determine which spatial specification is supported by the data
Robustness check: verify that your chosen model is not rejected by the data
Exploratory analysis: when you have no strong prior about the form of spatial dependence
Publication: report the GNS alongside your preferred model to show robustness
Different weight matrices: when the spatial structure differs for lag, Durbin, and error terms

Key Assumptions

Large \(N\) recommended: the GNS has many parameters and requires sufficient cross-sectional variation for identification
Balanced panel: required for spatial structure
Stationarity: \(|\rho| < 1\) and \(|\lambda| < 1\)
Identification concern: with a single \(W\) for all three terms, the model may be weakly identified (Gibbons & Overman, 2012)

Detailed Guide¶

Using Different Weight Matrices¶

A key feature of the GNS is the ability to use different weight matrices for each spatial component. This is theoretically motivated when the mechanisms of spatial interaction differ:

# Example: Regional economics
# W1: contiguity (outcome spillovers via shared borders)
W1 = SpatialWeights.from_contiguity(gdf, criterion='queen').matrix

# W2: trade flows (covariate spillovers via economic linkages)
W2 = trade_weight_matrix  # Custom N x N matrix

# W3: distance (error correlation via proximity)
W3 = SpatialWeights.from_distance(coords, threshold=500).matrix

model = GeneralNestingSpatial(
    "y ~ x1 + x2", data, "region", "year",
    W1=W1, W2=W2, W3=W3
)
results = model.fit(effects='fixed', method='ml')

Tip

Using different weight matrices improves identification. When \(W_1 = W_2 = W_3\), the GNS can suffer from weak identification — different matrices provide the variation needed to separately identify \(\rho\), \(\theta\), and \(\lambda\).

Estimation¶

The GNS is estimated by maximum likelihood with numerical optimization over \(\rho\) and \(\lambda\):

results = model.fit(
    effects='fixed',       # 'fixed', 'random', or 'pooled'
    method='ml',           # Only ML is available for GNS
    rho_init=0.0,          # Starting value for rho
    lambda_init=0.0,       # Starting value for lambda
    include_wx=True,       # Include W2*X terms
    maxiter=1000,          # Maximum iterations
    optim_method='L-BFGS-B'  # Optimization algorithm
)

The log-likelihood involves two spatial determinant terms:

\[\ell = -\frac{NT}{2}\ln(2\pi\sigma^2) + T\ln|I - \rho W_1| + T\ln|I - \lambda W_3| - \frac{1}{2\sigma^2}\varepsilon'\varepsilon\]

Testing Restrictions¶

The test_restrictions() method performs Likelihood Ratio (LR) tests for nested models:

# LR test: H0: restricted model is adequate
# Under H0: LR = 2(llf_unrestricted - llf_restricted) ~ chi2(df)

# Test theta = 0 (GNS → SAC)
test_sac = model.test_restrictions(restrictions={'theta': 0})

# Test lambda = 0 (GNS → SDM)
test_sdm = model.test_restrictions(restrictions={'lambda': 0})

# Test rho = 0 (GNS → SDEM)
test_sdem = model.test_restrictions(restrictions={'rho': 0})

# Test all spatial parameters = 0 (GNS → OLS)
test_ols = model.test_restrictions(restrictions={'rho': 0, 'theta': 0, 'lambda': 0})

# Print results
for name, test in [('SAC', test_sac), ('SDM', test_sdm),
                    ('SDEM', test_sdem), ('OLS', test_ols)]:
    print(f"GNS vs {name}: LR={test['lr_statistic']:.2f}, "
          f"p={test['p_value']:.4f}, {test['conclusion']}")

Automatic Model Identification¶

The identify_model_type() method classifies the estimated GNS based on which parameters are statistically significant:

model_type = model.identify_model_type(results)
print(f"Data supports: {model_type}")
# Returns one of: 'SAR', 'SEM', 'SDM', 'SAC', 'SDEM', 'SDEM-SEM', 'GNS', 'OLS'

The identification uses significance of \(\rho\), \(\theta\), and \(\lambda\) at the 5% level:

\(\rho\) sig.	\(\theta\) sig.	\(\lambda\) sig.	Model
Yes	No	No	SAR
No	No	Yes	SEM
Yes	Yes	No	SDM
Yes	No	Yes	SAC
No	Yes	No	SLX
No	Yes	Yes	SDEM
Yes	Yes	Yes	GNS
No	No	No	OLS

Configuration Options¶

Constructor Parameters¶

Parameter	Type	Default	Description
`formula`	`str`	Required	Wilkinson formula
`data`	`pd.DataFrame`	Required	Panel dataset
`entity_col`	`str`	Required	Entity identifier
`time_col`	`str`	Required	Time identifier
`W1`	`np.ndarray`	`None`	Weight matrix for spatial lag (\(Wy\))
`W2`	`np.ndarray`	`None`	Weight matrix for spatial Durbin (\(WX\))
`W3`	`np.ndarray`	`None`	Weight matrix for spatial error (\(Wu\))
`weights`	`np.ndarray`	`None`	Observation weights

`fit()` Parameters¶

Parameter	Type	Default	Description
`effects`	`str`	`'fixed'`	`'fixed'`, `'random'`, or `'pooled'`
`method`	`str`	`'ml'`	`'ml'` (only available method)
`rho_init`	`float`	`0.0`	Initial value for \(\rho\)
`lambda_init`	`float`	`0.0`	Initial value for \(\lambda\)
`include_wx`	`bool`	`True`	Include \(W_2 X\) terms
`maxiter`	`int`	`1000`	Maximum iterations
`optim_method`	`str`	`'L-BFGS-B'`	Scipy optimizer

Caution

The GNS is a computationally expensive model. For large \(N\), estimation may take several minutes due to the need to compute two log-determinant terms (\(|I - \rho W_1|\) and \(|I - \lambda W_3|\)) at each iteration. Consider starting with simpler models (SAR, SEM, SDM) and using the GNS only for formal model selection.

Tutorials¶

Tutorial	Description	Links
Spatial Econometrics	Includes GNS model selection

References¶

Elhorst, J.P. (2014). Spatial Econometrics: From Cross-Sectional Data to Spatial Panels. Springer.
LeSage, J. and Pace, R.K. (2009). Introduction to Spatial Econometrics. Chapman & Hall/CRC.
Manski, C.F. (1993). Identification of endogenous social effects: The reflection problem. Review of Economic Studies, 60(3), 531-542.
Gibbons, S. and Overman, H.G. (2012). Mostly pointless spatial econometrics? Journal of Regional Science, 52(2), 172-191.