
Static Models API Reference

Module

Import: from panelbox.models.static import PooledOLS, FixedEffects, RandomEffects, BetweenEstimator, FirstDifferenceEstimator, MeanGroupEstimator, PooledMeanGroupEstimator, SUR

Source: panelbox/models/static/

Overview

Static panel models are the workhorses of panel data econometrics. All estimators share a consistent interface: construct with a formula and data, then call .fit() to obtain results.

| Estimator | Description | Use Case |
|---|---|---|
| PooledOLS | Ordinary least squares ignoring panel structure | Baseline comparison |
| FixedEffects | Within estimator eliminating entity-specific intercepts | Time-invariant unobserved heterogeneity |
| RandomEffects | GLS with random entity effects | Uncorrelated unobserved effects |
| BetweenEstimator | OLS on entity means | Cross-sectional variation |
| FirstDifferenceEstimator | OLS on first-differenced data | Alternative to FE (equivalent when T=2) |
| MeanGroupEstimator | Average of entity-specific OLS regressions | Slope heterogeneity (Pesaran & Smith 1995) |
| PooledMeanGroupEstimator | ECM with homogeneous long-run coefficients | Long-run homogeneity + short-run heterogeneity (Pesaran, Shin & Smith 1999) |
| SUR | Seemingly Unrelated Regressions (Zellner 1962) | Cross-equation correlated errors + different regressors |

Common Constructor Pattern

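Most static models share the core constructor signature below. Some estimators add their own parameters (documented per class), and PooledMeanGroupEstimator and SUR do not accept weights: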

ModelClass(
    formula: str,
    data: pd.DataFrame,
    entity_col: str,
    time_col: str,
    weights: np.ndarray | None = None,
)

| Parameter | Type | Default | Description |
|---|---|---|---|
| formula | str | required | R-style formula, e.g. "y ~ x1 + x2" |
| data | pd.DataFrame | required | Panel data DataFrame |
| entity_col | str | required | Column identifying entities |
| time_col | str | required | Column identifying time periods |
| weights | np.ndarray \| None | None | Observation weights for WLS |

Common .fit() Method

model.fit(cov_type: str = "nonrobust", **cov_kwds) -> PanelResults

| Parameter | Type | Default | Description |
|---|---|---|---|
| cov_type | str | "nonrobust" | Covariance estimator type |
| **cov_kwds | dict | | Additional keyword arguments for the covariance estimator |

Available cov_type Options

| Value | Description |
|---|---|
| "nonrobust" | Classical OLS/GLS standard errors |
| "robust" | Heteroskedasticity-robust (HC1) |
| "hc0" -- "hc3" | White heteroskedasticity-consistent variants |
| "clustered" | Cluster-robust by entity (default clustering) |
| "twoway" | Two-way clustering by entity and time |
| "driscoll_kraay" | Driscoll-Kraay (cross-sectionally robust) |
| "newey_west" | Newey-West HAC |
| "pcse" | Panel-corrected standard errors (Beck-Katz) |

Returns: PanelResults
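
To make the difference between some of these options concrete, here is a minimal pure-Python sketch for a single regressor without an intercept. It computes the classical variance, the White HC0 sandwich, the HC1 degrees-of-freedom rescaling, and a cluster-robust sandwich without small-sample correction. The helper names are illustrative and are not part of panelbox, whose internal implementation may differ.

```python
def _fit(x, y):
    """No-intercept OLS: return slope, residuals, and the sum of squares of x."""
    sxx = sum(a * a for a in x)
    beta = sum(a * b for a, b in zip(x, y)) / sxx
    resid = [b - beta * a for a, b in zip(x, y)]
    return beta, resid, sxx

def classical_var(x, y):
    """Homoskedastic variance: s^2 / sum(x^2), with s^2 = RSS / (n - k)."""
    _, resid, sxx = _fit(x, y)
    s2 = sum(e * e for e in resid) / (len(x) - 1)
    return s2 / sxx

def hc_var(x, y, kind="hc1"):
    """White sandwich; HC1 rescales HC0 by n / (n - k)."""
    _, resid, sxx = _fit(x, y)
    n, k = len(x), 1
    var = sum((a * e) ** 2 for a, e in zip(x, resid)) / sxx ** 2
    return var * n / (n - k) if kind == "hc1" else var

def cluster_var(x, y, groups):
    """Cluster sandwich: sum the scores x*e within each cluster before squaring."""
    _, resid, sxx = _fit(x, y)
    scores = {}
    for g, a, e in zip(groups, x, resid):
        scores[g] = scores.get(g, 0.0) + a * e
    return sum(s * s for s in scores.values()) / sxx ** 2

x = [1.0, 2.0, 3.0, 4.0]
y = [1.1, 1.8, 3.5, 3.2]
firm = ["A", "A", "B", "B"]
print(classical_var(x, y), hc_var(x, y, "hc0"), hc_var(x, y), cluster_var(x, y, firm))
```

When every observation forms its own cluster, the cluster sandwich in this sketch reduces to HC0.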


Classes

PooledOLS

Pooled Ordinary Least Squares. Treats all observations as independent, ignoring the panel structure. Useful as a baseline for comparison with panel estimators.

PooledOLS(
    formula: str,
    data: pd.DataFrame,
    entity_col: str,
    time_col: str,
    weights: np.ndarray | None = None,
)

Example

from panelbox import PooledOLS, load_grunfeld

data = load_grunfeld()
model = PooledOLS("invest ~ value + capital", data, "firm", "year")
result = model.fit(cov_type="clustered")
result.summary()

FixedEffects

Within estimator that eliminates entity-specific (and optionally time-specific) fixed effects by demeaning.

FixedEffects(
    formula: str,
    data: pd.DataFrame,
    entity_col: str,
    time_col: str,
    entity_effects: bool = True,
    time_effects: bool = False,
    weights: np.ndarray | None = None,
)

| Parameter | Type | Default | Description |
|---|---|---|---|
| entity_effects | bool | True | Include entity fixed effects |
| time_effects | bool | False | Include time fixed effects (two-way FE) |

When to use Fixed Effects

Use FE when you suspect unobserved entity-level heterogeneity is correlated with the regressors. The Hausman test can help decide between FE and RE.

Example

from panelbox import FixedEffects, load_grunfeld

data = load_grunfeld()

# One-way entity FE
fe = FixedEffects("invest ~ value + capital", data, "firm", "year")
result = fe.fit(cov_type="robust")

# Two-way FE (entity + time)
fe2 = FixedEffects(
    "invest ~ value + capital", data, "firm", "year",
    entity_effects=True, time_effects=True
)
result2 = fe2.fit(cov_type="clustered")
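
The demeaning step behind the within estimator can be illustrated without panelbox. The sketch below builds a toy panel in which the entity intercepts rise with the level of x, so pooled OLS overstates the slope while the within estimator recovers it. All function and variable names are illustrative, not library API.

```python
from statistics import mean

def ols_slope(x, y):
    """Slope of a simple regression of y on x (with intercept)."""
    xb, yb = mean(x), mean(y)
    sxy = sum((a - xb) * (b - yb) for a, b in zip(x, y))
    sxx = sum((a - xb) ** 2 for a in x)
    return sxy / sxx

def within_slope(entities, x, y):
    """Demean x and y by entity, then regress demeaned y on demeaned x."""
    groups = {}
    for e, a, b in zip(entities, x, y):
        groups.setdefault(e, []).append((a, b))
    xd, yd = [], []
    for obs in groups.values():
        xb = mean(a for a, _ in obs)
        yb = mean(b for _, b in obs)
        xd += [a - xb for a, _ in obs]
        yd += [b - yb for _, b in obs]
    return sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)

# Toy panel: y = alpha_i + 2 * x, where alpha_i is larger for the high-x entity
entities = ["A"] * 3 + ["B"] * 3
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
alpha = {"A": 0.0, "B": 30.0}
y = [alpha[e] + 2.0 * a for e, a in zip(entities, x)]

pooled = ols_slope(x, y)               # contaminated by the entity intercepts
within = within_slope(entities, x, y)  # intercepts are removed by demeaning
print(f"pooled={pooled:.3f}  within={within:.3f}")
```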

RandomEffects

GLS estimator with random entity effects. Uses the Swamy-Arora variance decomposition by default.

RandomEffects(
    formula: str,
    data: pd.DataFrame,
    entity_col: str,
    time_col: str,
    variance_estimator: str = "swamy-arora",
    weights: np.ndarray | None = None,
)

| Parameter | Type | Default | Description |
|---|---|---|---|
| variance_estimator | str | "swamy-arora" | Method for estimating variance components |

When to use Random Effects

Use RE when unobserved heterogeneity is uncorrelated with the regressors. RE is more efficient than FE under this assumption. Verify with the Hausman test.

Example

from panelbox import RandomEffects, load_grunfeld

data = load_grunfeld()
model = RandomEffects("invest ~ value + capital", data, "firm", "year")
result = model.fit()
result.summary()
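
For intuition, random-effects GLS on a balanced panel is equivalent to OLS on quasi-demeaned data, where each variable has \(\theta\) times its entity mean subtracted and \(\theta = 1 - \sqrt{\sigma_e^2 / (\sigma_e^2 + T\sigma_u^2)}\). The sketch below takes the variance components as given (estimating them, for example by Swamy-Arora, is not shown) and uses illustrative helper names that are not part of panelbox.

```python
import math

def re_theta(sigma2_e, sigma2_u, T):
    """Quasi-demeaning weight for a balanced panel with T periods per entity."""
    return 1.0 - math.sqrt(sigma2_e / (sigma2_e + T * sigma2_u))

def quasi_demean(values, theta):
    """Subtract theta times the entity mean from one entity's observations."""
    m = sum(values) / len(values)
    return [v - theta * m for v in values]

theta_none = re_theta(1.0, 0.0, T=5)    # no entity effect: RE collapses to pooled OLS
theta_big = re_theta(1.0, 100.0, T=5)   # dominant entity effect: RE approaches FE
print(theta_none, round(theta_big, 3))
print(quasi_demean([1.0, 2.0, 3.0], theta_big))
```

A weight of 0 leaves the data untouched (pooled OLS) and a weight of 1 is full demeaning (the within estimator), which is why RE sits between the two.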

BetweenEstimator

OLS regression on entity means (cross-sectional variation only). Estimates the relationship using between-entity variation by averaging all observations within each entity.

BetweenEstimator(
    formula: str,
    data: pd.DataFrame,
    entity_col: str,
    time_col: str,
    weights: np.ndarray | None = None,
)

Example

from panelbox import BetweenEstimator, load_grunfeld

data = load_grunfeld()
model = BetweenEstimator("invest ~ value + capital", data, "firm", "year")
result = model.fit()
result.summary()

FirstDifferenceEstimator

OLS on first-differenced data. Eliminates entity fixed effects by differencing consecutive observations. Equivalent to Fixed Effects when T=2.

FirstDifferenceEstimator(
    formula: str,
    data: pd.DataFrame,
    entity_col: str,
    time_col: str,
    weights: np.ndarray | None = None,
)

FD vs FE

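First Difference uses only adjacent-period variation, while FE uses all within-entity variation. FD is the more efficient of the two when the idiosyncratic errors follow a random walk (strong positive serial correlation); FE is more efficient when they are serially uncorrelated. With T=2 the two estimators coincide.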

Example

from panelbox import FirstDifferenceEstimator, load_grunfeld

data = load_grunfeld()
model = FirstDifferenceEstimator("invest ~ value + capital", data, "firm", "year")
result = model.fit(cov_type="robust")
result.summary()
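
The equivalence of FD and FE at T=2 can be checked by hand. The sketch below uses a single regressor, no intercept in the differenced regression, and no time effects; the helper names and toy numbers are illustrative only.

```python
def fd_slope(panel):
    """No-intercept OLS on first differences; panel maps entity -> [(x, y), ...] in time order."""
    dx, dy = [], []
    for obs in panel.values():
        for (x0, y0), (x1, y1) in zip(obs, obs[1:]):
            dx.append(x1 - x0)
            dy.append(y1 - y0)
    return sum(a * b for a, b in zip(dx, dy)) / sum(a * a for a in dx)

def fe_slope(panel):
    """Within estimator: OLS on entity-demeaned x and y."""
    xd, yd = [], []
    for obs in panel.values():
        xb = sum(x for x, _ in obs) / len(obs)
        yb = sum(y for _, y in obs) / len(obs)
        xd += [x - xb for x, _ in obs]
        yd += [y - yb for _, y in obs]
    return sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)

# Two periods per entity: FD and FE give the same slope
panel_t2 = {
    "A": [(1.0, 3.0), (2.0, 6.5)],
    "B": [(4.0, 10.0), (7.0, 15.0)],
    "C": [(2.0, 1.0), (1.0, 0.5)],
}
# Three periods: the two estimators generally differ
panel_t3 = {"A": [(1.0, 1.0), (2.0, 3.0), (4.0, 4.0)]}

print(fd_slope(panel_t2), fe_slope(panel_t2))
print(fd_slope(panel_t3), fe_slope(panel_t3))
```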

Comparison Example

from panelbox import (
    PooledOLS, FixedEffects, RandomEffects,
    BetweenEstimator, FirstDifferenceEstimator,
    load_grunfeld,
)

data = load_grunfeld()
formula = "invest ~ value + capital"

models = {
    "Pooled OLS": PooledOLS(formula, data, "firm", "year"),
    "Fixed Effects": FixedEffects(formula, data, "firm", "year"),
    "Random Effects": RandomEffects(formula, data, "firm", "year"),
    "Between": BetweenEstimator(formula, data, "firm", "year"),
    "First Difference": FirstDifferenceEstimator(formula, data, "firm", "year"),
}

for name, model in models.items():
    result = model.fit()
    print(f"{name:20s}  R-sq={result.rsquared:.4f}  N={result.nobs}")

MeanGroupEstimator

Mean Group estimator (Pesaran & Smith, 1995). Estimates entity-specific OLS regressions and averages coefficients across entities. Consistent for the average effect under slope heterogeneity, a setting in which pooled estimators such as FE/RE can be inconsistent (notably in dynamic panels).

MeanGroupEstimator(
    formula: str,
    data: pd.DataFrame,
    entity_col: str,
    time_col: str,
    weights: np.ndarray | None = None,
    min_obs_per_entity: int = 10,
)

| Parameter | Type | Default | Description |
|---|---|---|---|
| formula | str | required | R-style formula, e.g. "y ~ x1 + x2" |
| data | pd.DataFrame | required | Panel data DataFrame |
| entity_col | str | required | Column identifying entities |
| time_col | str | required | Column identifying time periods |
| weights | np.ndarray \| None | None | Entity-level weights for weighted MG estimation |
| min_obs_per_entity | int | 10 | Minimum observations per entity to include |

Returns: MeanGroupResults (extends PanelResults)

MeanGroupResults

| Attribute | Type | Description |
|---|---|---|
| entity_params | dict | {entity_id: pd.Series} of entity-specific coefficients |
| entity_std_errors | dict | {entity_id: pd.Series} of entity-specific standard errors |
| entity_rsquared | dict | {entity_id: float} of entity-specific R-squared |
| n_entities_used | int | Number of entities used in estimation |
| entities_excluded | list | Entities excluded due to insufficient observations |
| swamy_test_result | dict | Swamy (1970) slope homogeneity test (statistic, p_value, df) |

| Method | Description |
|---|---|
| entity_summary(entity_id) | Print OLS summary for a single entity |
| coefficient_table() | DataFrame of all entity-specific coefficients |
| plot_coefficient_distribution(variable) | Boxplot of entity coefficients for a variable |

Example

from panelbox.models.static import MeanGroupEstimator

data = ...  # panel DataFrame with columns: country, year, y, x1, x2
mg = MeanGroupEstimator("y ~ x1 + x2", data, "country", "year")
result = mg.fit()

# Average coefficients and Swamy test
result.summary()
print(result.swamy_test_result)

# Entity-level detail
result.coefficient_table()
result.entity_summary("USA")
result.plot_coefficient_distribution("x1")
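
As a rough illustration of the averaging idea behind the Mean Group estimator, the sketch below fits one simple regression per entity and averages the slopes, using the cross-entity dispersion of the slopes for the standard error (the nonparametric formula usually associated with Pesaran & Smith). It is unweighted, handles a single regressor, and uses illustrative names; whether panelbox computes its standard errors exactly this way is an assumption.

```python
import math
from statistics import mean, stdev

def entity_slope(obs):
    """Slope from a simple regression (with intercept) on one entity's (x, y) pairs."""
    xb = mean(x for x, _ in obs)
    yb = mean(y for _, y in obs)
    sxy = sum((x - xb) * (y - yb) for x, y in obs)
    sxx = sum((x - xb) ** 2 for x, _ in obs)
    return sxy / sxx

def mean_group(panel):
    """Average the entity slopes; SE = sample std of the slopes / sqrt(N)."""
    slopes = {e: entity_slope(obs) for e, obs in panel.items()}
    values = list(slopes.values())
    estimate = mean(values)
    std_error = stdev(values) / math.sqrt(len(values))
    return estimate, std_error, slopes

panel = {
    "A": [(0.0, 5.0), (1.0, 6.0), (2.0, 7.0)],  # slope 1
    "B": [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0)],  # slope 2
    "C": [(0.0, 1.0), (1.0, 4.0), (2.0, 7.0)],  # slope 3
}
mg_estimate, mg_se, slopes = mean_group(panel)
print(mg_estimate, mg_se, slopes)
```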

PooledMeanGroupEstimator

Pooled Mean Group estimator (Pesaran, Shin & Smith, 1999). Estimates an error-correction model where long-run coefficients are homogeneous across entities while short-run dynamics are entity-specific.

PooledMeanGroupEstimator(
    formula: str,
    data: pd.DataFrame,
    entity_col: str,
    time_col: str,
    lags: int = 1,
    max_iter: int = 100,
    tol: float = 1e-5,
)

| Parameter | Type | Default | Description |
|---|---|---|---|
| formula | str | required | R-style formula, e.g. "y ~ x1 + x2" |
| data | pd.DataFrame | required | Panel data DataFrame |
| entity_col | str | required | Column identifying entities |
| time_col | str | required | Column identifying time periods |
| lags | int | 1 | Number of lags for short-run dynamics |
| max_iter | int | 100 | Maximum optimization iterations |
| tol | float | 1e-5 | Convergence tolerance |

Example

from panelbox.models.static import PooledMeanGroupEstimator

# data: panel DataFrame with columns country, year, y, x1, x2
pmg = PooledMeanGroupEstimator(
    "y ~ x1 + x2", data, "country", "year", lags=1
)
result = pmg.fit()
result.summary()
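
For reference, the error-correction form used by Pesaran, Shin & Smith (1999) can be written as below. The lag orders \(p\) and \(q\) are generic notation from the paper; how the lags argument maps onto them in panelbox is not stated here and should be checked against the source.

\[
\Delta y_{it} = \phi_i \left( y_{i,t-1} - \theta' x_{it} \right) + \sum_{j=1}^{p-1} \lambda_{ij} \, \Delta y_{i,t-j} + \sum_{j=0}^{q-1} \delta_{ij}' \, \Delta x_{i,t-j} + \mu_i + \varepsilon_{it}
\]

The long-run coefficients \(\theta\) are common to all entities, while the adjustment speed \(\phi_i\), the short-run coefficients \(\lambda_{ij}\) and \(\delta_{ij}\), and the intercept \(\mu_i\) are entity-specific.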

hausman_mg_pmg

Hausman test comparing MG and PMG estimators. Tests whether the long-run homogeneity restriction imposed by PMG is valid.

from panelbox.models.static import hausman_mg_pmg

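# mg_result and pmg_result are the objects returned by
# MeanGroupEstimator(...).fit() and PooledMeanGroupEstimator(...).fit()
test = hausman_mg_pmg(mg_result, pmg_result)
# Returns dict with 'statistic', 'p_value', 'df'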

| Result | Interpretation |
|---|---|
| Fail to reject \(H_0\) | PMG preferred (efficient + consistent) |
| Reject \(H_0\) | MG preferred (consistent under heterogeneity) |
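
The logic of the test can be sketched for a single long-run coefficient, where the Hausman statistic is the squared difference of the two estimates divided by the difference of their variances and is compared with a chi-squared distribution with one degree of freedom. The function below is a hypothetical stand-alone helper, not the panelbox implementation; its dict keys simply mirror the ones listed above, and the input numbers are made up.

```python
import math

def hausman_scalar(b_mg, var_mg, b_pmg, var_pmg):
    """Hausman test for one coefficient: MG is consistent, PMG is efficient under H0."""
    dv = var_mg - var_pmg
    if dv <= 0:
        raise ValueError("variance difference must be positive for the test to be defined")
    stat = (b_mg - b_pmg) ** 2 / dv
    # Survival function of chi-squared with 1 df: P(X > s) = erfc(sqrt(s / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return {"statistic": stat, "p_value": p_value, "df": 1}

example = hausman_scalar(b_mg=0.50, var_mg=0.04, b_pmg=0.45, var_pmg=0.01)
print(example)
```

With several long-run coefficients the same idea applies with the vector of differences and the inverse of the difference of the covariance matrices, and the degrees of freedom equal the number of coefficients compared.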

SUR

Seemingly Unrelated Regressions (Zellner, 1962). Treats each entity as a separate equation in a system with cross-equation correlated errors. Estimates via Feasible GLS using the Kronecker structure of the covariance matrix.

SUR(
    formula: str | dict,
    data: pd.DataFrame,
    entity_col: str,
    time_col: str,
    homogeneous: bool = False,
    iterate: bool = False,
    max_iter: int = 100,
    tol: float = 1e-6,
)

| Parameter | Type | Default | Description |
|---|---|---|---|
| formula | str \| dict | required | R-style formula (str) applied to all entities, or dict mapping entity IDs to per-entity formulas |
| data | pd.DataFrame | required | Panel data DataFrame |
| entity_col | str | required | Column identifying entities (each entity = one equation) |
| time_col | str | required | Column identifying time periods |
| homogeneous | bool | False | If True, constrain all entities to share the same coefficients |
| iterate | bool | False | If True, iterate FGLS until convergence (ISUR ≈ MLE) |
| max_iter | int | 100 | Maximum iterations for iterated SUR |
| tol | float | 1e-6 | Convergence tolerance (relative change in beta) |

When to use SUR

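SUR is most useful when entities have different regressors and correlated errors. When every equation has an identical regressor matrix (the same values, not just the same variable names), SUR point estimates equal equation-by-equation OLS and there is no efficiency gain; the same is true when the cross-equation errors are uncorrelated. Use the Breusch-Pagan test in the results to verify that cross-equation correlation is significant.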
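
The claim about identical regressors can be verified with a small two-equation sketch. It runs one-step FGLS for two equations with a single regressor each and no intercept, so the GLS normal equations with weight matrix \(\Sigma^{-1} \otimes I_T\) reduce to a 2x2 system. Function names and data are illustrative and unrelated to the panelbox internals.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def ols(x, y):
    """No-intercept OLS slope."""
    return dot(x, y) / dot(x, x)

def sur_two_equations(x1, y1, x2, y2):
    """One-step FGLS for y1 = b1*x1 + e1 and y2 = b2*x2 + e2."""
    T = len(x1)
    b1, b2 = ols(x1, y1), ols(x2, y2)
    e1 = [y - b1 * x for x, y in zip(x1, y1)]
    e2 = [y - b2 * x for x, y in zip(x2, y2)]
    # Residual covariance matrix Sigma and the elements of its inverse
    s11, s22, s12 = dot(e1, e1) / T, dot(e2, e2) / T, dot(e1, e2) / T
    det = s11 * s22 - s12 ** 2
    i11, i22, i12 = s22 / det, s11 / det, -s12 / det
    # GLS normal equations A b = c with weight matrix kron(Sigma^-1, I_T)
    a11, a12, a22 = i11 * dot(x1, x1), i12 * dot(x1, x2), i22 * dot(x2, x2)
    c1 = i11 * dot(x1, y1) + i12 * dot(x1, y2)
    c2 = i12 * dot(x2, y1) + i22 * dot(x2, y2)
    d = a11 * a22 - a12 ** 2
    sur = ((c1 * a22 - a12 * c2) / d, (a11 * c2 - a12 * c1) / d)
    return sur, (b1, b2)

x_a = [1.0, 2.0, 3.0, 4.0, 5.0]
x_b = [2.0, 1.0, 4.0, 3.0, 6.0]
y_a = [1.2, 1.9, 3.2, 3.8, 5.1]
y_b = [0.4, 1.3, 1.2, 2.3, 2.4]

same_sur, same_ols = sur_two_equations(x_a, y_a, x_a, y_b)  # identical regressors
diff_sur, diff_ols = sur_two_equations(x_a, y_a, x_b, y_b)  # different regressors
print(same_sur, same_ols)
print(diff_sur, diff_ols)
```

In the first call the SUR and OLS estimates agree up to rounding; in the second they differ because the regressors differ and the residuals are correlated across equations.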

.fit() Method

sur.fit() -> SURResults

The SUR .fit() method does not take cov_type — covariances are determined by the GLS structure.

SURResults

Extends PanelResults with system-level diagnostics.

| Attribute | Type | Description |
|---|---|---|
| entity_params | dict | {entity_id: pd.Series} per-entity SUR coefficient estimates |
| entity_std_errors | dict | {entity_id: pd.Series} per-entity standard errors |
| entity_rsquared | dict | {entity_id: float} per-entity R-squared |
| sigma_matrix | np.ndarray | Cross-equation covariance matrix (\(N \times N\)) |
| correlation_matrix | np.ndarray | Cross-equation correlation matrix (\(N \times N\)) |
| system_rsquared | float | McElroy (1977) system R-squared |
| n_iterations | int | FGLS iterations performed (0 if not iterated) |
| converged | bool | Whether iterated SUR converged |
| ols_params | dict | {entity_id: pd.Series} pre-GLS OLS coefficients |
| efficiency_gain | dict | {entity_id: pd.Series} ratio SE(SUR)/SE(OLS); values < 1 indicate gain |
| bp_independence_test | dict | Breusch-Pagan test (statistic, pvalue, df) |

| Method | Description |
|---|---|
| system_summary() | Print system-level summary with Sigma, BP test, efficiency gains |
| equation_summary(entity_id) | Print SUR vs OLS comparison for a single entity |
| plot_correlation_matrix(ax=None) | Heatmap of cross-equation correlations |
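
The Breusch-Pagan test of independence is commonly computed as the Lagrange multiplier statistic \(T \sum_{i<j} r_{ij}^2\), where \(r_{ij}\) are the cross-equation residual correlations, with \(N(N-1)/2\) degrees of freedom. The sketch below applies that textbook formula to a correlation matrix given as nested lists; it is a hypothetical helper, and the exact computation in panelbox may differ. The p-value is omitted because it needs a chi-squared distribution function that the standard library does not provide for general degrees of freedom.

```python
def bp_independence_lm(corr, T):
    """LM statistic and degrees of freedom from an N x N residual correlation matrix."""
    n = len(corr)
    stat = T * sum(corr[i][j] ** 2 for i in range(n) for j in range(i + 1, n))
    df = n * (n - 1) // 2
    return stat, df

independent = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
correlated = [[1.0, 0.5], [0.5, 1.0]]

print(bp_independence_lm(independent, T=20))
print(bp_independence_lm(correlated, T=20))
```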

Examples

String formula (same for all entities):

import panelbox as pb

data = pb.load_grunfeld()
sur = pb.SUR("invest ~ value + capital", data, "firm", "year")
result = sur.fit()
print(result.system_summary())

# Check if SUR helped
print(f"BP test p-value: {result.bp_independence_test['pvalue']:.4f}")

Dict formula (per-entity specifications):

sur = pb.SUR(
    formula={
        "General Motors": "invest ~ value + capital",
        "Chrysler": "invest ~ value",
        "General Electric": "invest ~ capital",
    },
    data=data,
    entity_col="firm",
    time_col="year",
)
result = sur.fit()
result.equation_summary("General Motors")

Iterated SUR (MLE-equivalent under normality):

sur = pb.SUR(
    "invest ~ value + capital", data, "firm", "year",
    iterate=True, max_iter=200, tol=1e-8,
)
result = sur.fit()
print(f"Converged: {result.converged}, iterations: {result.n_iterations}")

See Also