Skip to content

Diagnostics & Validation

Panel data models rest on assumptions about error structure, functional form, and variable selection. Diagnostic tests verify whether these assumptions hold in your data, guiding you toward valid inference and reliable conclusions.

Why Diagnostics Matter

Econometric estimates are only as credible as the assumptions underlying the model. Violations lead to:

  • Biased coefficients (omitted variables, wrong functional form)
  • Invalid standard errors (heteroskedasticity, serial correlation, cross-sectional dependence)
  • Inconsistent estimates (endogeneity, non-stationarity)

PanelBox provides 50+ diagnostic tests organized into six categories, all returning consistent result objects for programmatic use.

Diagnostic Workflow

A disciplined testing workflow proceeds from broad model choice to specific assumption checks:

Step 1: Specification       Is the model correctly specified?
    |
Step 2: Serial Correlation  Are errors autocorrelated?
    |
Step 3: Heteroskedasticity  Is variance constant across entities?
    |
Step 4: Cross-sectional     Are entities correlated?
        Dependence
    |
Step 5: Stationarity        Are variables stationary?
    |
Step 6: Cointegration       Do non-stationary variables share
                            a long-run equilibrium?

Step 1: Specification Tests

Determine whether the model is correctly specified before examining residual properties.

Test Question When to Use
Hausman Fixed Effects or Random Effects? After estimating FE and RE
Mundlak Are random effects correlated with regressors? Alternative to Hausman
RESET Is the functional form correct? Any linear model
Chow Do parameters change over time? Suspected structural break
J-Test Which non-nested model is better? Comparing alternative specifications
Cox / Encompassing Does one model encompass another? Likelihood-based model comparison

Specification Tests Overview

Step 2: Serial Correlation Tests

Test whether errors within an entity are correlated across time periods.

Test Detects Best For
Wooldridge AR First-order autocorrelation FE models
Breusch-Godfrey Higher-order autocorrelation Any model
Baltagi-Wu LBI Autocorrelation in unbalanced panels Unbalanced panels

Serial Correlation Tests

Step 3: Heteroskedasticity Tests

Test whether the error variance is constant across entities and time.

Test Detects Best For
Modified Wald Groupwise heteroskedasticity FE models
Breusch-Pagan Heteroskedasticity linked to regressors Any model
White General heteroskedasticity Any model

Heteroskedasticity Tests

Step 4: Cross-Sectional Dependence Tests

Test whether residuals are correlated across entities at a given time period.

Test Approach Best For
Pesaran CD Average pairwise correlations Large N panels
Breusch-Pagan LM Sum of squared correlations Small N panels
Frees Non-parametric Robust to non-normality

Cross-Sectional Dependence Tests

Step 5: Unit Root Tests

Test whether panel variables are stationary or contain unit roots.

Test H₀ Approach
LLC Common unit root Pooled ADF
IPS Individual unit roots Average ADF
Fisher Individual unit roots Combines p-values

Unit Root Tests

Step 6: Cointegration Tests

For non-stationary variables, test whether a long-run equilibrium relationship exists.

Test Approach Statistics
Pedroni Residual-based 7 statistics (panel + group)
Kao Residual-based ADF-based
Westerlund Error-correction 4 statistics with bootstrap

Cointegration Tests

ValidationSuite: One-Line Comprehensive Testing

The ValidationSuite runs all applicable tests on a model result in a single call:

from panelbox.validation import ValidationSuite

suite = ValidationSuite(results)
report = suite.run(tests="all", alpha=0.05)
print(report)

Selective Testing

Run specific test categories:

# Run only specification tests
spec_results = suite.run_specification_tests(alpha=0.05)

# Run only serial correlation tests
serial_results = suite.run_serial_correlation_tests(alpha=0.05)

# Run only heteroskedasticity tests
het_results = suite.run_heteroskedasticity_tests(alpha=0.05)

# Run only cross-sectional dependence tests
cd_results = suite.run_cross_sectional_tests(alpha=0.05)

Test Selection Options

Option Tests Run
"all" Specification + Serial + Heteroskedasticity + Cross-sectional
"default" Recommended tests for the model type
"serial" Serial correlation tests only
"het" Heteroskedasticity tests only
"cd" Cross-sectional dependence tests only

Default Tests by Model Type

  • Fixed Effects: Serial correlation + Heteroskedasticity + Cross-sectional
  • Random Effects: Cross-sectional dependence
  • Pooled OLS: Heteroskedasticity + Cross-sectional

Common Result Pattern

All diagnostic tests in PanelBox return objects with a consistent interface:

from panelbox.validation.base import ValidationTestResult

# Every test result provides:
result.test_name            # str   -- Name of the test
result.statistic            # float -- Test statistic value
result.pvalue               # float -- P-value
result.df                   # int, tuple, or None -- Degrees of freedom
result.alpha                # float -- Significance level used
result.null_hypothesis      # str   -- H₀ description
result.alternative_hypothesis  # str -- H₁ description
result.reject_null          # bool  -- Whether to reject at alpha
result.conclusion           # str   -- Human-readable interpretation
result.metadata             # dict  -- Additional test-specific info

# Formatted output
print(result.summary())

Interpreting Test Results

The interpretation logic is the same across all tests:

Condition Meaning Action
p-value < \(\alpha\) Reject H₀ Assumption violated -- take corrective action
p-value \(\geq \alpha\) Fail to reject H₀ No evidence against assumption

Common Misconception

Failing to reject H₀ does not prove the assumption holds. It means the data does not provide sufficient evidence against it at the chosen significance level.

Decision Tree: Which Test to Run

Is this a static panel model (FE/RE/Pooled)?
├── Yes
│   ├── Need to choose FE vs RE?
│   │   ├── Yes → Hausman Test or Mundlak Test
│   │   └── No  → Skip
│   ├── Check functional form → RESET Test
│   ├── Check serial correlation → Wooldridge AR Test
│   ├── Check heteroskedasticity → Modified Wald / Breusch-Pagan
│   └── Check cross-sectional dependence → Pesaran CD
├── Is this a GMM model?
│   ├── Check instrument validity → Hansen J Test
│   ├── Check serial correlation → AR(1)/AR(2) Tests
│   └── System GMM? → Difference-in-Hansen Test
└── Comparing alternative specifications?
    ├── Nested models → Likelihood Ratio Test / Wald Test
    └── Non-nested models → J-Test / Cox Test

Quick Reference Table

Test H₀ Good Result Bad Result Fix
Hausman RE consistent p \(\geq\) 0.05 (use RE) p < 0.05 (use FE) Switch to FE
RESET Correct spec p \(\geq\) 0.05 p < 0.05 Add nonlinear terms
Wooldridge No AR(1) p \(\geq\) 0.05 p < 0.05 Robust/DK SEs
Modified Wald Homoskedastic p \(\geq\) 0.05 p < 0.05 Robust SEs
Pesaran CD No CD p \(\geq\) 0.05 p < 0.05 DK or PCSE SEs
LLC/IPS Unit root p < 0.05 (stationary) p \(\geq\) 0.05 Difference or cointegration

Complete Testing Workflow Example

from panelbox.models.static.fixed_effects import FixedEffects
from panelbox.models.static.random_effects import RandomEffects
from panelbox.validation.specification.hausman import HausmanTest
from panelbox.validation import ValidationSuite

# Step 1: Estimate FE and RE models
fe = FixedEffects("invest ~ value + capital", data, "firm", "year")
fe_results = fe.fit()

re = RandomEffects("invest ~ value + capital", data, "firm", "year")
re_results = re.fit()

# Step 2: Hausman test for model selection
hausman = HausmanTest(fe_results, re_results)
print(hausman.summary())
# Use the recommended model
chosen_results = fe_results if hausman.reject_null else re_results

# Step 3: Run comprehensive diagnostics on chosen model
suite = ValidationSuite(chosen_results)
report = suite.run(tests="all", alpha=0.05)
print(report)

Software Comparison

Test Category PanelBox Stata R
Specification HausmanTest, MundlakTest, RESETTest hausman, estat ovtest plm::phtest(), lmtest::resettest()
Serial Correlation WooldridgeARTest, BreuschGodfreyTest xtserial plm::pbsytest(), plm::pwartest()
Heteroskedasticity ModifiedWaldTest, WhiteTest estat hettest plm::pcdtest()
Cross-sectional PesaranCDTest, FreesTest xtcsd plm::pcdtest()
Unit Root LLCTest, IPSTest, FisherTest xtunitroot plm::purtest()
Cointegration PedroniTest, KaoTest xtpedroni, xtcointtest plm::cipstest()

See Also

References

  • Baltagi, B. H. (2021). Econometric Analysis of Panel Data (6th ed.). Springer.
  • Cameron, A. C., & Trivedi, P. K. (2005). Microeconometrics: Methods and Applications. Cambridge University Press.
  • Wooldridge, J. M. (2010). Econometric Analysis of Cross Section and Panel Data (2nd ed.). MIT Press.