Modified Wald Test¶

Quick Reference

Class: panelbox.validation.heteroskedasticity.modified_wald.ModifiedWaldTest H₀: \(\sigma_i^2 = \sigma^2\) for all \(i = 1, \ldots, N\) (homoskedastic across entities) H₁: \(\sigma_i^2 \neq \sigma_j^2\) for at least one pair (groupwise heteroskedasticity) Statistic: Wald ~ \(\chi^2(N)\) Stata equivalent: xttest3 R equivalent: Custom implementation

What It Tests¶

The Modified Wald test detects groupwise heteroskedasticity in fixed effects panel models. It tests whether the error variance is the same across all cross-sectional entities or varies by group.

This is the most relevant heteroskedasticity test for FE models because it directly addresses the question: "Do different entities have different error variances?"

Quick Example¶

from panelbox import FixedEffects
from panelbox.datasets import load_grunfeld
from panelbox.validation.heteroskedasticity.modified_wald import ModifiedWaldTest

# Estimate Fixed Effects model
data = load_grunfeld()
fe = FixedEffects(data, "invest", ["value", "capital"], "firm", "year")
results = fe.fit()

# Run Modified Wald test
test = ModifiedWaldTest(results)
result = test.run(alpha=0.05)

print(f"Wald statistic: {result.statistic:.3f}")
print(f"P-value:        {result.pvalue:.4f}")
print(f"Degrees of freedom: {result.df}")
print(result.conclusion)

# Examine variance heterogeneity
meta = result.metadata
print(f"N entities:       {meta['n_entities']}")
print(f"Pooled variance:  {meta['pooled_variance']:.4f}")
print(f"Min entity var:   {meta['min_entity_var']:.4f}")
print(f"Max entity var:   {meta['max_entity_var']:.4f}")
print(f"Variance ratio:   {meta['variance_ratio']:.2f}")

Interpretation¶

Jointly Using the p-value and Variance Ratio¶

The test provides both a formal statistical test and a descriptive measure of heterogeneity:

p-value	Variance Ratio	Interpretation
< 0.01	> 10	Strong heteroskedasticity -- use robust SE or FGLS
0.01 -- 0.05	5 -- 10	Moderate heteroskedasticity -- use robust SE
0.05 -- 0.10	2 -- 5	Weak heteroskedasticity -- consider robust SE
> 0.10	< 2	No evidence -- standard SE adequate

Variance Ratio Guidelines

The variance ratio (max_entity_var / min_entity_var) provides practical context:

Ratio < 2: Variances are relatively homogeneous
Ratio 2--5: Moderate heterogeneity
Ratio 5--10: Strong heterogeneity; robust SE recommended
Ratio > 10: Very strong heterogeneity; consider FGLS

A significant p-value with a small variance ratio (< 2) may indicate the test is detecting statistically significant but practically negligible differences due to large sample size.

Mathematical Details¶

Entity-Level Variances¶

For each entity \(i\), the entity-specific variance is estimated as:

\[\hat{\sigma}_i^2 = \frac{1}{T_i - 1} \sum_{t=1}^{T_i} \hat{e}_{it}^2\]

where \(\hat{e}_{it}\) are the fixed effects residuals.

Pooled Variance¶

The pooled variance is:

\[\hat{\sigma}^2 = \frac{\sum_{i=1}^N \sum_{t=1}^{T_i} \hat{e}_{it}^2}{nT - N - k}\]

where \(nT\) is the total number of observations, \(N\) is the number of entities, and \(k\) is the number of estimated parameters.

Wald Statistic¶

The test statistic is:

\[W = \sum_{i=1}^{N} T_i \ln\left(\frac{\hat{\sigma}^2}{\hat{\sigma}_i^2}\right) \sim \chi^2(N)\]

Under \(H_0\), all entity variances equal the pooled variance and \(W\) is approximately chi-squared distributed with \(N\) degrees of freedom.

Configuration Options¶

Parameter	Type	Default	Description
`alpha`	`float`	`0.05`	Significance level

Result Metadata¶

Key	Type	Description
`n_entities`	`int`	Number of entities (N)
`pooled_variance`	`float`	Pooled error variance \(\hat{\sigma}^2\)
`min_entity_var`	`float`	Minimum entity-level variance
`max_entity_var`	`float`	Maximum entity-level variance
`variance_ratio`	`float`	Ratio of max to min entity variance

Diagnostics¶

Comparing Standard Error Corrections¶

When the test rejects homoskedasticity, compare how different SE corrections change inference:

if result.reject_null:
    # Re-estimate with different SE
    results_std = fe.fit()
    results_robust = fe.fit(cov_type="robust")
    results_cluster = fe.fit(cov_type="clustered")

    print("Variable   | SE(std)  | SE(robust) | SE(cluster) | Ratio(r/s)")
    print("-----------|----------|------------|-------------|----------")
    for var in ["value", "capital"]:
        se_s = results_std.std_errors[var]
        se_r = results_robust.std_errors[var]
        se_c = results_cluster.std_errors[var]
        print(f"{var:10s} | {se_s:.4f}  | {se_r:.4f}    | "
              f"{se_c:.4f}     | {se_r/se_s:.2f}")

Reading the Comparison

Ratio > 1: Robust SE are larger than standard SE, suggesting standard SE are downward biased (the common case with heteroskedasticity)
Ratio < 1: Standard SE are already conservative; heteroskedasticity inflates standard SE for some variables
Large differences (ratio > 1.5 or < 0.7): Heteroskedasticity has substantial impact on inference

Common Pitfalls¶

Common Pitfalls

FE models only: The test is designed for Fixed Effects models. Using it with other model types triggers a warning and may produce unreliable results.
Entities with zero variance: If any entity has zero residual variance (perfect fit within entity), it is skipped in the computation. This can happen with entities having very few observations.
Large N: With many entities, the chi-squared approximation with \(N\) degrees of freedom works well, but the test becomes very powerful and may detect trivially small differences.
Statistical vs. practical significance: With large panels, the test will almost always reject. Always check the variance ratio alongside the p-value to assess practical significance.
Unbalanced panels: The test handles unbalanced panels through entity-specific \(T_i\) weights, but entities with very few periods will have imprecisely estimated variances.

References¶

Greene, W. H. (2000). Econometric Analysis (4^th ed.). Prentice Hall, Chapter 14.
Baum, C. F. (2001). "Residual diagnostics for cross-section time series regression models." Stata Journal, 1(1), 101-104.
Wooldridge, J. M. (2010). Econometric Analysis of Cross Section and Panel Data (2^nd ed.). MIT Press, Chapter 10.