Heteroskedasticity Tests¶

What Is Heteroskedasticity?¶

Heteroskedasticity occurs when the variance of the error terms is not constant:

\[\text{Var}(\varepsilon_{it}) \neq \sigma^2\]

In panel data, the most common form is groupwise heteroskedasticity, where the error variance differs across entities:

\[\sigma_i^2 \neq \sigma_j^2 \quad \text{for some } i \neq j\]

This means some entities (e.g., large firms) may have systematically larger or smaller residual variance than others (e.g., small firms).

Consequences of Ignoring Heteroskedasticity

OLS coefficient estimates remain consistent but are inefficient (not minimum variance)
Classical standard errors are biased (can be too small or too large)
t-tests, F-tests, and confidence intervals are unreliable
GLS estimators assume homoskedasticity and become suboptimal

Available Tests¶

PanelBox provides three complementary tests for detecting heteroskedasticity:

Test	H₀	Type	Best For
Modified Wald	\(\sigma_i^2 = \sigma^2\) for all \(i\)	Groupwise	FE models
Breusch-Pagan	Homoskedasticity	LM (parametric)	General; identifies source
White	Homoskedasticity	LM (model-free)	No functional form assumed

When to Use Each Test¶

Fixed Effects ModelsGeneral ModelsIdentifying Sources

Use the Modified Wald test as your primary diagnostic. It is specifically designed for FE models and tests whether entity-level variances differ.

from panelbox.validation.heteroskedasticity.modified_wald import ModifiedWaldTest

test = ModifiedWaldTest(results)
result = test.run(alpha=0.05)

Use the White test when you want a model-free test with no assumptions about the form of heteroskedasticity.

from panelbox.validation.heteroskedasticity.white import WhiteTest

test = WhiteTest(results)
result = test.run(alpha=0.05, cross_terms=True)

Use the Breusch-Pagan test when you want to identify which regressors drive the heteroskedasticity.

from panelbox.validation.heteroskedasticity.breusch_pagan import BreuschPaganTest

test = BreuschPaganTest(results)
result = test.run(alpha=0.05)

Recommended Workflow¶

from panelbox import FixedEffects
from panelbox.datasets import load_grunfeld
from panelbox.validation.heteroskedasticity.modified_wald import ModifiedWaldTest
from panelbox.validation.heteroskedasticity.breusch_pagan import BreuschPaganTest
from panelbox.validation.heteroskedasticity.white import WhiteTest

# Load data and estimate model
data = load_grunfeld()
fe = FixedEffects(data, "invest", ["value", "capital"], "firm", "year")
results = fe.fit()

# Step 1: Modified Wald test (FE-specific)
mw = ModifiedWaldTest(results)
mw_result = mw.run()
print(f"Modified Wald: chi2={mw_result.statistic:.3f}, p={mw_result.pvalue:.4f}")
print(f"  Variance ratio (max/min): {mw_result.metadata['variance_ratio']:.2f}")

# Step 2: White test (model-free)
white = WhiteTest(results)
w_result = white.run(cross_terms=True)
print(f"White test:    LM={w_result.statistic:.3f}, p={w_result.pvalue:.4f}")

# Step 3: Breusch-Pagan (parametric)
bp = BreuschPaganTest(results)
bp_result = bp.run()
print(f"Breusch-Pagan: LM={bp_result.statistic:.3f}, p={bp_result.pvalue:.4f}")

# Decision
if any(r.reject_null for r in [mw_result, w_result, bp_result]):
    print("\nHeteroskedasticity detected. Use robust standard errors:")
    results_robust = fe.fit(cov_type="robust")
    print(results_robust.summary())

What to Do If Heteroskedasticity Is Detected¶

Option 1: Robust Standard Errors (HC0--HC3)¶

The simplest correction -- adjusts SE without changing coefficient estimates:

results_robust = fe.fit(cov_type="robust")    # Default HC1
results_hc3 = fe.fit(cov_type="hc3")          # HC3 (recommended for small samples)

Option 2: Clustered Standard Errors¶

Also handles serial correlation within entities:

results_cluster = fe.fit(cov_type="clustered")

Option 3: Variable Transformation¶

If the variance is proportional to a variable (e.g., firm size):

import numpy as np

# Log transformation to stabilize variance
data["log_invest"] = np.log(data["invest"])
fe_log = FixedEffects(data, "log_invest", ["value", "capital"], "firm", "year")
results_log = fe_log.fit()

Interpreting Results¶

All heteroskedasticity tests return a ValidationTestResult:

result.test_name       # Name of the test
result.statistic       # Test statistic (chi-squared or Wald)
result.pvalue          # p-value
result.df              # Degrees of freedom
result.reject_null     # True if H₀ rejected
result.conclusion      # Human-readable conclusion
result.metadata        # Test-specific details

p-value	Decision	Action
< 0.01	Strong rejection	Use robust or clustered SE
0.01 -- 0.05	Rejection	Use robust SE
0.05 -- 0.10	Borderline	Consider robust SE as precaution
> 0.10	Fail to reject	Standard SE likely adequate

Practical Advice

Even when the test fails to reject, using robust standard errors is a common practice in applied work. The cost of robustness (slight efficiency loss) is small compared to the cost of incorrect inference from biased SE.

Software Equivalents¶

PanelBox	Stata	R
`ModifiedWaldTest`	`xttest3`	Custom implementation
`BreuschPaganTest`	`estat hettest`	`lmtest::bptest()`
`WhiteTest`	`estat imtest, white`	`skedastic::white()`

References¶

Greene, W. H. (2018). Econometric Analysis (8^th ed.). Pearson, Chapter 14.
Breusch, T. S., & Pagan, A. R. (1979). "A simple test for heteroscedasticity and random coefficient variation." Econometrica, 47(5), 1287-1294.
White, H. (1980). "A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity." Econometrica, 48(4), 817-838.
Baum, C. F. (2001). "Residual diagnostics for cross-section time series regression models." Stata Journal, 1(1), 101-104.
Wooldridge, J. M. (2010). Econometric Analysis of Cross Section and Panel Data (2^nd ed.). MIT Press, Chapter 10.