Spatial Econometrics Theory --- Modeling Geographic Dependence¶

Key Takeaway

When observations are spatially dependent, standard estimators produce biased or inefficient results. Spatial econometric models explicitly incorporate geographic relationships through a spatial weight matrix $W$, capturing spillovers between entities. The choice between spatial lag (SAR), spatial error (SEM), and spatial Durbin (SDM) models depends on the nature of the dependence.

Motivation¶

Tobler's First Law of Geography states: "Everything is related to everything else, but near things are more related than distant things." In economic data, this manifests as:

Housing prices in neighboring areas are correlated
Unemployment rates cluster geographically
Technology adoption spreads through proximity
Policy decisions of one region affect neighbors

Ignoring spatial dependence leads to:

Biased estimates (if dependence is in the dependent variable)
Inefficient estimates and invalid inference (if dependence is in the errors)
Misleading standard errors and hypothesis tests

The Spatial Weight Matrix¶

All spatial models require a spatial weight matrix $W$ that encodes the relationships between entities:

\[ W = \begin{pmatrix} 0 & w_{12} & w_{13} & \cdots & w_{1N} \\ w_{21} & 0 & w_{23} & \cdots & w_{2N} \\ \vdots & & \ddots & & \vdots \\ w_{N1} & w_{N2} & \cdots & w_{N,N-1} & 0 \end{pmatrix} \]

The diagonal is zero by convention (an entity is not its own neighbor).

Types of Weight Matrices¶

Type	Definition	Use Case
Queen contiguity	$w_{ij} = 1$ if $i$ and $j$ share a boundary or vertex	Administrative regions
Rook contiguity	$w_{ij} = 1$ if $i$ and $j$ share an edge	Regular grids
Distance-based	$w_{ij} = 1$ if $d_{ij} < \bar{d}$, else 0	Point data
Inverse distance	$w_{ij} = d_{ij}^{-\alpha}$	Gravity-type models
K-nearest neighbors	$w_{ij} = 1$ if $j$ is among $k$ nearest neighbors of $i$	Irregular layouts
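
As a minimal sketch of how two of these matrices could be built from point coordinates, the following NumPy code constructs a distance-band matrix and a k-nearest-neighbors matrix. The function names `distance_band_weights` and `knn_weights`, the coordinates, and the threshold are hypothetical choices for illustration, not part of any particular library.

```python
import numpy as np

def pairwise_distances(coords):
    """Euclidean distance between every pair of points."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def distance_band_weights(coords, threshold):
    """Binary weights: w_ij = 1 if d_ij < threshold, with a zero diagonal."""
    W = (pairwise_distances(coords) < threshold).astype(float)
    np.fill_diagonal(W, 0.0)
    return W

def knn_weights(coords, k):
    """Binary weights: w_ij = 1 if j is among the k nearest neighbors of i."""
    dist = pairwise_distances(coords)
    np.fill_diagonal(dist, np.inf)            # an entity is not its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :k]
    W = np.zeros_like(dist)
    rows = np.repeat(np.arange(len(coords)), k)
    W[rows, nearest.ravel()] = 1.0
    return W

# Four hypothetical points on a line at x = 0, 1, 2.5, 4.5
coords = np.array([[0.0, 0.0], [1.0, 0.0], [2.5, 0.0], [4.5, 0.0]])
W_band = distance_band_weights(coords, threshold=1.75)
W_knn = knn_weights(coords, k=1)
```

Two properties are worth noticing in this example: the distance band leaves the last point without any neighbor (an "island"), while the k-nearest-neighbors matrix gives every point exactly $k$ neighbors but is generally not symmetric.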

Row Standardization¶

Most applications use row-standardized weights:

\[ w_{ij}^* = \frac{w_{ij}}{\sum_{k=1}^N w_{ik}} \]

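This ensures that each element of $Wy_t$ is a weighted average of the neighbors' values. It also bounds the largest eigenvalue of $W$ at 1, so the spatial parameter can be restricted to $(-1, 1)$, where it has a natural interpretation.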
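
A short sketch of row standardization, assuming a small hypothetical binary matrix; `row_standardize` is an illustrative helper name. Rows without neighbors are left as zeros rather than divided by zero, which is one possible convention among several.

```python
import numpy as np

def row_standardize(W):
    """Divide each row by its sum; rows with no neighbors stay all-zero."""
    W = np.asarray(W, dtype=float)
    row_sums = W.sum(axis=1, keepdims=True)
    safe = np.where(row_sums > 0, row_sums, 1.0)   # avoid division by zero
    return W / safe

# Hypothetical 3-unit example: unit 0 neighbors units 1 and 2
W = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]])
W_std = row_standardize(W)

y = np.array([10.0, 20.0, 40.0])
spatial_lag = W_std @ y   # each entry is the average of that unit's neighbors
```

Here the spatial lag of unit 0 is the mean of its two neighbors' values, and the lags of units 1 and 2 both equal the value of unit 0, their only neighbor.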

Spatial Lag Model (SAR)¶

Specification¶

\[ y_{it} = \rho \sum_{j=1}^N w_{ij} y_{jt} + X_{it}'\beta + \alpha_i + \varepsilon_{it} \]

or in matrix form: $y_t = \rho W y_t + X_t \beta + \alpha + \varepsilon_t$.

Key Features¶

The term $Wy$ is endogenous --- $y_j$ depends on $y_i$ and vice versa
OLS is inconsistent; requires ML or IV estimation
Captures global spillovers with multiplier effects

Spatial Multiplier¶

Solving for $y$:

\[ y_t = (I_N - \rho W)^{-1}(X_t\beta + \alpha + \varepsilon_t) \]

The matrix $(I_N - \rho W)^{-1}$ is the spatial multiplier, which propagates shocks through the spatial network. A change in $x_j$ affects not just entity $j$ but all connected entities through higher-order feedback.
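
The propagation can be illustrated numerically. The sketch below assumes a hypothetical row-standardized $W$ for four units arranged on a line and $\rho = 0.4$. It checks that the multiplier equals the power series $I + \rho W + \rho^2 W^2 + \cdots$ (valid when $|\rho| < 1$ with row-standardized $W$) and traces a unit shock to the first entity.

```python
import numpy as np

# Hypothetical row-standardized W for 4 units on a line (1-2-3-4)
W = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0, 0.0]])
rho = 0.4
N = W.shape[0]

multiplier = np.linalg.inv(np.eye(N) - rho * W)

# Truncated power series: first-order, second-order, ... neighbor effects
series = np.zeros((N, N))
term = np.eye(N)
for _ in range(60):
    series += term
    term = rho * (W @ term)

# A unit shock that hits entity 1 only
shock = np.array([1.0, 0.0, 0.0, 0.0])
response = multiplier @ shock
```

The response of the shocked entity exceeds the initial shock because part of the effect feeds back from its neighbors, and the response of the other entities declines with their distance along the line.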

Parameter Interpretation¶

$\rho > 0$: Positive spatial dependence (clustering of similar values)
$\rho < 0$: Negative spatial dependence (dissimilar neighbors)
$|\rho| < 1$: Ensures stability, since $I_N - \rho W$ is then invertible (with row-standardized $W$)

Spatial Error Model (SEM)¶

Specification¶

\[ y_{it} = X_{it}'\beta + \alpha_i + u_{it} \]

\[ u_{it} = \lambda \sum_{j=1}^N w_{ij} u_{jt} + \varepsilon_{it} \]

Spatial dependence is in the error term, not the dependent variable.

Key Features¶

OLS remains unbiased but is inefficient, and its conventional standard errors are invalid
Reflects omitted spatially correlated variables
No multiplier effects --- a shock to entity $j$'s covariates affects only $j$'s outcome
$\beta$ coefficients have the standard interpretation: $\partial y_i / \partial x_{ik} = \beta_k$

When to Use¶

SEM is appropriate when spatial dependence arises from:

Common unobserved shocks (weather, macro conditions)
Measurement error with spatial patterns
Omitted spatially correlated variables that do not directly cause $y$

Spatial Durbin Model (SDM)¶

Specification¶

\[ y_{it} = \rho \sum_{j=1}^N w_{ij} y_{jt} + X_{it}'\beta + \sum_{j=1}^N w_{ij} X_{jt}'\theta + \alpha_i + \varepsilon_{it} \]

The SDM includes spatial lags of both the dependent variable and the covariates.

Generality¶

The SDM is a general specification that nests both SAR and SEM:

SAR: $\theta = 0$
SEM: $\theta + \rho\beta = 0$ (common factor restriction)
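
The common factor restriction can be derived in a few steps. Writing the SEM in matrix form (with the fixed effects suppressed for brevity, an assumption made here only to keep the algebra short), substituting the error process, and rearranging gives:

\[
\begin{aligned}
y_t &= X_t\beta + u_t, \qquad u_t = \lambda W u_t + \varepsilon_t \\
(I_N - \lambda W)\, y_t &= (I_N - \lambda W)\, X_t\beta + \varepsilon_t \\
y_t &= \lambda W y_t + X_t\beta + W X_t(-\lambda\beta) + \varepsilon_t
\end{aligned}
\]

The last line is an SDM with $\rho = \lambda$ and $\theta = -\rho\beta$, which is the restriction $\theta + \rho\beta = 0$.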

LeSage and Pace (2009) recommend the SDM as the default starting point for spatial analysis, testing restrictions to simpler models afterward.

Direct and Indirect Effects¶

In spatial models with endogenous spatial lags, coefficient interpretation requires computing partial derivatives:

\[ \frac{\partial y}{\partial x_k'} = (I_N - \rho W)^{-1}(I_N \beta_k + W\theta_k) \]

Effect	Definition	Interpretation
Direct	Average diagonal of $(I - \rho W)^{-1}(I\beta_k + W\theta_k)$	Effect of $x_{ik}$ on $y_i$ (includes feedback)
Indirect	Average off-diagonal row sum	Effect of $x_{jk}$ on $y_i$ for $j \neq i$ (spillover)
Total	Direct + Indirect	Complete effect including all spatial channels

Coefficient $\neq$ Marginal Effect

In SAR and SDM models, the regression coefficient $\beta_k$ is not the marginal effect of $x_k$ on $y$. Always compute and report direct, indirect, and total effects.
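
The effects can be computed directly from the partial-derivative matrix. The following is a minimal sketch, assuming a hypothetical three-unit row-standardized $W$ and illustrative parameter values; `spatial_effects` is a made-up helper name. The indirect effect is obtained as the total effect minus the direct effect, which equals the average off-diagonal row sum.

```python
import numpy as np

def spatial_effects(W, rho, beta_k, theta_k=0.0):
    """Average direct, indirect, and total effects of covariate k.

    Uses S = (I - rho W)^{-1} (I beta_k + W theta_k); theta_k = 0 gives SAR.
    """
    N = W.shape[0]
    I = np.eye(N)
    S = np.linalg.solve(I - rho * W, I * beta_k + W * theta_k)
    direct = np.trace(S) / N          # average diagonal element
    total = S.sum() / N               # average row sum
    indirect = total - direct         # average off-diagonal row sum
    return direct, indirect, total

# Hypothetical W: three units, each a neighbor of the other two
W = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])

direct, indirect, total = spatial_effects(W, rho=0.5, beta_k=1.0)
sdm_total = spatial_effects(W, rho=0.5, beta_k=1.0, theta_k=0.5)[2]
```

With a row-standardized $W$, the total effect in this sketch reduces to $(\beta_k + \theta_k)/(1 - \rho)$, which offers a quick sanity check on the output. The direct effect is larger than $\beta_k$ because it includes feedback through the neighbors.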

General Nesting Spatial Model (GNS)¶

Specification¶

\[ y_{it} = \rho \sum_{j=1}^N w_{ij} y_{jt} + X_{it}'\beta + \sum_{j=1}^N w_{ij} X_{jt}'\theta + u_{it} \]

\[ u_{it} = \lambda \sum_{j=1}^N w_{ij} u_{jt} + \varepsilon_{it} \]

The GNS nests all other spatial models ($\rho$, $\theta$, and $\lambda$ all present). It is the most flexible but also the most complex specification.

Dynamic Spatial Panels¶

When both spatial and temporal dependence are present:

\[ y_{it} = \tau y_{i,t-1} + \rho \sum_j w_{ij} y_{jt} + \phi \sum_j w_{ij} y_{j,t-1} + X_{it}'\beta + \alpha_i + \varepsilon_{it} \]

This model captures:

Temporal persistence ($\tau$): own past affects current outcome
Contemporaneous spatial dependence ($\rho$): neighbors' current outcomes matter
Space-time diffusion ($\phi$): neighbors' past outcomes matter
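
To see how the three channels interact, the model can be iterated in its reduced form $y_t = (I_N - \rho W)^{-1}(\tau y_{t-1} + \phi W y_{t-1} + \cdots)$. The sketch below is a deterministic impulse-response exercise under assumed parameter values. The function name `simulate_dynamic_spatial` and the `drift` argument, which stands in for $X_t\beta + \alpha + \varepsilon_t$ held fixed over time, are simplifications introduced here for illustration.

```python
import numpy as np

def simulate_dynamic_spatial(W, y0, tau, rho, phi, steps, drift=None):
    """Iterate y_t = (I - rho W)^{-1} (tau y_{t-1} + phi W y_{t-1} + drift)."""
    N = W.shape[0]
    drift = np.zeros(N) if drift is None else drift
    A = np.linalg.inv(np.eye(N) - rho * W)    # contemporaneous spatial multiplier
    path = [np.asarray(y0, dtype=float)]
    for _ in range(steps):
        prev = path[-1]
        path.append(A @ (tau * prev + phi * (W @ prev) + drift))
    return np.array(path)

# Hypothetical W: three units, each a neighbor of the other two
W = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])

# Start with a unit value in entity 1 only and let it diffuse
path = simulate_dynamic_spatial(W, y0=[1.0, 0.0, 0.0],
                                tau=0.5, rho=0.2, phi=0.1, steps=20)
```

In this example the initial value in the first entity reaches its neighbors in the very first period and then dies out, because $\tau + \rho + \phi < 1$ with these assumed values.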

Estimation Methods¶

Maximum Likelihood (ML)¶

The log-likelihood for SAR with panel FE:

\[ \ell = -\frac{NT}{2}\ln(2\pi\sigma^2) + T\ln|I_N - \rho W| - \frac{1}{2\sigma^2}\sum_{i=1}^N\sum_{t=1}^T\Big(y_{it} - \rho \sum_{j=1}^N w_{ij} y_{jt} - X_{it}'\beta - \alpha_i\Big)^2 \]

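The log-determinant term $\ln|I_N - \rho W|$ is the main computational challenge, because it must be evaluated for every candidate value of $\rho$. It can be computed efficiently from the eigenvalues $\omega_1, \ldots, \omega_N$ of $W$, which need to be found only once: $\ln|I_N - \rho W| = \sum_{i=1}^N \ln(1 - \rho \omega_i)$.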

Best for small to medium panels ($N < 1000$)
Efficient under correct specification
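
The eigenvalue shortcut for the log-determinant can be checked against a direct computation. This is a minimal sketch on a hypothetical four-unit $W$; `logdet_eig` is an illustrative helper name.

```python
import numpy as np

def logdet_eig(rho, eigvals):
    """ln|I - rho W| from precomputed eigenvalues of W."""
    # Eigenvalues can be complex for asymmetric W; they come in conjugate
    # pairs, so the imaginary parts of the logs cancel in the sum.
    return np.sum(np.log(1.0 - rho * eigvals)).real

# Hypothetical row-standardized W for 4 units on a line
W = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0, 0.0]])
N = W.shape[0]

eigvals = np.linalg.eigvals(W)        # computed once, reused for every rho
rho = 0.3
fast = logdet_eig(rho, eigvals)

# Direct computation for comparison
sign, direct = np.linalg.slogdet(np.eye(N) - rho * W)

# Evaluating the term on a grid of rho values only reuses the eigenvalues
grid = np.linspace(-0.9, 0.9, 7)
logdets = np.array([logdet_eig(r, eigvals) for r in grid])
```

The design point is that the eigenvalues are computed a single time, after which each evaluation of the log-determinant inside the likelihood optimizer costs only a sum over $N$ terms.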

Generalized Method of Moments¶

For large panels, GMM avoids computing the log-determinant:

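Uses spatial lags of the covariates as instruments for the endogenous $Wy$: $WX, W^2X, \ldots$ in the SAR model, or $W^2X, W^3X, \ldots$ when $WX$ is already a regressor (as in the SDM)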
Best for large panels ($N > 1000$)
Less efficient than ML but computationally cheaper
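
The simplest instrumental-variables version of this idea is two-stage least squares (2SLS), a special case of GMM. The sketch below applies it to a cross-sectional SAR model without fixed effects, using simulated data. The instrument set $[X, WX, W^2X]$, the ring-shaped $W$, and all parameter values are assumptions chosen for illustration; `sar_2sls` is a made-up helper name.

```python
import numpy as np

def sar_2sls(y, X, W):
    """Two-stage least squares for y = rho W y + X beta + e.

    Instruments for W y: [X, W X, W^2 X] (an assumed, commonly used set).
    Returns (rho_hat, beta_hat).
    """
    Wy = W @ y
    Z = np.column_stack([Wy, X])              # regressors, W y first
    WX = W @ X
    H = np.column_stack([X, WX, W @ WX])      # instrument matrix
    # First stage: project the regressors onto the instrument space
    Z_hat = H @ np.linalg.lstsq(H, Z, rcond=None)[0]
    # Second stage: regress y on the projected regressors
    coef = np.linalg.lstsq(Z_hat, y, rcond=None)[0]
    return coef[0], coef[1:]

# Simulated data on a ring: each unit has two neighbors with weight 0.5
rng = np.random.default_rng(0)
N = 1000
idx = np.arange(N)
W = np.zeros((N, N))
W[idx, (idx - 1) % N] = 0.5
W[idx, (idx + 1) % N] = 0.5

rho_true, beta_true = 0.4, np.array([1.0, -0.5])
X = rng.normal(size=(N, 2))
eps = 0.5 * rng.normal(size=N)
y = np.linalg.solve(np.eye(N) - rho_true * W, X @ beta_true + eps)

rho_hat, beta_hat = sar_2sls(y, X, W)
```

No determinant of $I_N - \rho W$ appears anywhere in the estimator, which is why this approach scales to large $N$. A dense $W$ is used here only for brevity; a sparse representation would be the natural choice for large panels.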

Quasi-Maximum Likelihood (QML)¶

Robust to non-normality of $\varepsilon_{it}$
Consistent under weaker distributional assumptions
Requires the same log-determinant computation as ML

Testing for Spatial Autocorrelation¶

Moran's I¶

The most common global test for spatial autocorrelation:

\[ I = \frac{N}{S_0} \frac{e'We}{e'e} \]

where $e$ is the vector of residuals and $S_0 = \sum_i \sum_j w_{ij}$ is the sum of all weights. The standardized statistic $Z_I = (I - E[I])/\sqrt{\text{Var}[I]}$ is asymptotically $N(0,1)$ under the null of no spatial autocorrelation.

$I > E[I]$: Positive spatial autocorrelation (clustering)
$I < E[I]$: Negative spatial autocorrelation (dispersion)
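
The statistic itself is a one-line computation. As a minimal sketch, the code below evaluates Moran's I for two hypothetical mean-zero vectors on a four-unit chain with binary weights: one spatially clustered, one alternating. `morans_i` is an illustrative helper name, and inference (the mean and variance of $I$ under the null) is not covered here.

```python
import numpy as np

def morans_i(e, W):
    """Moran's I = (N / S0) * (e' W e) / (e' e) for a residual vector e."""
    e = np.asarray(e, dtype=float)
    N = e.shape[0]
    S0 = W.sum()                      # sum of all weights
    return (N / S0) * (e @ W @ e) / (e @ e)

# Binary contiguity for 4 units on a chain (1-2-3-4)
W = np.array([[0.0, 1.0, 0.0, 0.0],
              [1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])

clustered = np.array([1.0, 1.0, -1.0, -1.0])     # similar values side by side
alternating = np.array([1.0, -1.0, 1.0, -1.0])   # every neighbor differs

I_clustered = morans_i(clustered, W)
I_alternating = morans_i(alternating, W)
```

The clustered vector yields a positive statistic and the alternating vector a strongly negative one, matching the interpretation given above.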

Local Indicators of Spatial Association (LISA)¶

Local Moran's $I_i$ identifies local clusters and outliers:

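\[ I_i = \frac{(y_i - \bar{y})}{\sigma^2} \sum_j w_{ij}(y_j - \bar{y}) \]

where $\bar{y}$ is the sample mean and $\sigma^2$ here denotes the sample variance of $y$ (not the error variance of the regression models above).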

Cluster Type	Meaning
High-High (HH)	Hot spots --- high values surrounded by high values
Low-Low (LL)	Cold spots --- low values surrounded by low values
High-Low (HL)	Spatial outlier --- high value among low neighbors
Low-High (LH)	Spatial outlier --- low value among high neighbors
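
A sketch of how the local statistic and the four cluster types could be computed, assuming a hypothetical six-unit chain with row-standardized weights. `local_moran` is an illustrative name. The labels are assigned from the signs of the unit's own deviation and its neighbors' average deviation only; the significance testing that normally accompanies a LISA map (for example by permutation) is not shown.

```python
import numpy as np

def local_moran(y, W):
    """Local Moran's I_i and quadrant labels (HH, LL, HL, LH)."""
    y = np.asarray(y, dtype=float)
    z = y - y.mean()
    sigma2 = (z ** 2).mean()          # assumes the population-variance convention
    lag = W @ z                       # weighted sum of neighbors' deviations
    I_local = (z / sigma2) * lag
    labels = np.where(z >= 0,
                      np.where(lag >= 0, "HH", "HL"),
                      np.where(lag >= 0, "LH", "LL"))
    return I_local, labels

# Row-standardized weights for 6 units on a chain
N = 6
W = np.zeros((N, N))
for i in range(N - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
W = W / W.sum(axis=1, keepdims=True)

# Hypothetical values: a run of high values, a low pocket, one high endpoint
y = np.array([10.0, 10.0, 10.0, 0.0, 0.0, 10.0])
I_local, labels = local_moran(y, W)
```

Positive values of $I_i$ correspond to the HH and LL types, and negative values to the HL and LH outliers.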

LM Tests for Model Selection¶

Lagrange Multiplier tests on OLS residuals guide model choice:

LM-Lag test ($H_0: \rho = 0$):

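\[ LM_\rho = \frac{(e'Wy / \hat{\sigma}^2)^2}{T_\rho} \]

where $T_\rho = (WX\hat{\beta})' M (WX\hat{\beta}) / \hat{\sigma}^2 + \text{tr}(W'W + W^2)$ and $M = I - X(X'X)^{-1}X'$.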

LM-Error test ($H_0: \lambda = 0$):

\[ LM_\lambda = \frac{(e'We / \hat{\sigma}^2)^2}{\text{tr}(W'W + W^2)} \]

Decision rule:

Only LM-Lag significant: use SAR
Only LM-Error significant: use SEM
Both significant: check robust versions
- Robust LM-Lag significant: SAR
- Robust LM-Error significant: SEM
- Both robust significant: SDM or GNS
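
Both statistics can be computed from a single OLS fit. The sketch below uses the cross-sectional forms and takes $T_\rho = (WX\hat{\beta})' M (WX\hat{\beta}) / \hat{\sigma}^2 + \text{tr}(W'W + W^2)$ with $M = I - X(X'X)^{-1}X'$, the standard form from Anselin (1988). The simulated SAR data, the ring-shaped $W$, and the helper name `lm_tests` are assumptions for illustration, and the robust versions of the tests are not implemented.

```python
import numpy as np

def lm_tests(y, X, W):
    """LM-Lag and LM-Error statistics from OLS residuals (cross-section)."""
    N = len(y)
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ beta
    sigma2 = (e @ e) / N
    trace_term = np.trace(W.T @ W + W @ W)

    # Apply M = I - X (X'X)^{-1} X' to W X beta without forming M explicitly
    WXb = W @ (X @ beta)
    M_WXb = WXb - X @ np.linalg.lstsq(X, WXb, rcond=None)[0]
    T_rho = (WXb @ M_WXb) / sigma2 + trace_term

    lm_lag = (e @ (W @ y) / sigma2) ** 2 / T_rho
    lm_err = (e @ (W @ e) / sigma2) ** 2 / trace_term
    return lm_lag, lm_err

# Simulated SAR data on a ring: each unit has two neighbors with weight 0.5
rng = np.random.default_rng(1)
N = 400
idx = np.arange(N)
W = np.zeros((N, N))
W[idx, (idx - 1) % N] = 0.5
W[idx, (idx + 1) % N] = 0.5

X = np.column_stack([np.ones(N), rng.normal(size=N)])
signal = X @ np.array([1.0, 2.0]) + rng.normal(size=N)
y = np.linalg.solve(np.eye(N) - 0.5 * W, signal)

lm_lag, lm_err = lm_tests(y, X, W)
```

Each statistic is compared with a $\chi^2(1)$ critical value (about 3.84 at the 5% level). Because the data here are generated with $\rho = 0.5$, the LM-Lag statistic should be far above that threshold.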

Model Comparison¶

Feature	SAR	SEM	SDM	GNS
Spatial lag of $y$	Yes	No	Yes	Yes
Spatial lag of $X$	No	No	Yes	Yes
Spatial error	No	Yes	No	Yes
Global spillovers	Yes	No	Yes	Yes
Parameters	$K+2$	$K+2$	$2K+2$	$2K+3$
Interpretation	Simple	Simple	Complex	Complex

Practical Implications¶

Always test first: Run Moran's I on OLS residuals before fitting a spatial model
Use LM tests to guide model selection, but also consider economic theory
SDM as default: When unsure, start with SDM (nests SAR and SEM)
Report effects: For SAR and SDM, always compute direct, indirect, and total effects
Sensitivity to $W$: Test robustness to different weight matrix specifications
Row-standardize $W$ for interpretable spatial parameters

Key References¶

Anselin, L. (1988). Spatial Econometrics: Methods and Models. Kluwer Academic Publishers. --- Foundational textbook for spatial econometrics.
LeSage, J. & Pace, R.K. (2009). Introduction to Spatial Econometrics. CRC Press. --- Modern treatment with direct/indirect effects decomposition.
Elhorst, J.P. (2014). Spatial Econometrics: From Cross-Sectional Data to Spatial Panels. Springer. --- Comprehensive guide to spatial panel models.
Lee, L.F. & Yu, J. (2010). "Estimation of Spatial Autoregressive Panel Data Models with Fixed Effects." Journal of Econometrics, 154(2), 165--185. --- ML estimation of spatial panel models.
Anselin, L. (1995). "Local Indicators of Spatial Association --- LISA." Geographical Analysis, 27(2), 93--115. --- Local spatial autocorrelation measures.
Moran, P.A.P. (1950). "Notes on Continuous Stochastic Phenomena." Biometrika, 37, 17--23. --- The original Moran's I statistic.
