Mundlak Test¶
Quick Reference
Class: panelbox.validation.specification.mundlak.MundlakTest
Result: ValidationTestResult
H₀: RE is consistent (entity effects uncorrelated with regressors, \(\gamma = 0\))
H₁: RE is inconsistent (use Fixed Effects, \(\gamma \neq 0\))
Stata equivalent: Manually add group means to RE regression
R equivalent: plm::phtest(formula, data = data, method = "aux")
What It Tests¶
The Mundlak test is an alternative to the Hausman test for choosing between Fixed Effects and Random Effects. Instead of comparing two sets of estimates, it augments the RE model with entity-level means of the time-varying regressors and tests whether those means are jointly significant.
The intuition is straightforward: if the entity effect \(\alpha_i\) is correlated with the regressors \(X_{it}\), then the entity means \(\bar{X}_i\) will capture this correlation. If the coefficients on \(\bar{X}_i\) are jointly significant in the augmented model, the RE assumption is violated.
The Mundlak Device¶
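The standard RE model is:
\[ y_{it} = X_{it}\beta + \alpha_i + \varepsilon_{it} \]
The Mundlak augmented model adds entity means:
\[ y_{it} = X_{it}\beta + \bar{X}_i\gamma + \alpha_i + \varepsilon_{it} \]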
where \(\bar{X}_i = \frac{1}{T_i} \sum_{t=1}^{T_i} X_{it}\) is the time average of regressors for entity \(i\).
If \(\gamma = 0\): the group means add no information, confirming RE is appropriate.
If \(\gamma \neq 0\): the regressors are correlated with the individual effects, and FE should be used.
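As a rough illustration of the device, the sketch below computes \(\bar{X}_i\) for an unbalanced panel and attaches it to every observation of the entity. It uses only the standard library; the helper name `add_entity_means` and the toy data are made up for this example and are not part of PanelBox.

```python
from collections import defaultdict


def add_entity_means(entities, x):
    """Return, for each observation, the mean of x within its entity.

    `entities` and `x` are parallel lists; entities may have different T_i.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for ent, value in zip(entities, x):
        totals[ent] += value
        counts[ent] += 1
    means = {ent: totals[ent] / counts[ent] for ent in totals}
    return [means[ent] for ent in entities]


# Toy unbalanced panel: firm "A" has 3 periods, firm "B" has 2.
firm = ["A", "A", "A", "B", "B"]
value = [1.0, 2.0, 3.0, 10.0, 20.0]

value_mean = add_entity_means(firm, value)
for f, v, m in zip(firm, value, value_mean):
    print(f"{f}: value={v:5.1f}  value_mean={m:5.1f}")
```

The resulting `value_mean` column is what would enter the augmented regression next to `value`.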
Quick Example¶
```python
from panelbox.models.static.random_effects import RandomEffects
from panelbox.validation.specification.mundlak import MundlakTest
from panelbox.datasets import load_grunfeld

data = load_grunfeld()

# Estimate Random Effects model
re = RandomEffects("invest ~ value + capital", data, "firm", "year")
re_results = re.fit()

# Run Mundlak test
mundlak = MundlakTest(re_results)
result = mundlak.run(alpha=0.05)
print(result.summary())

# Programmatic access
print(f"Wald statistic: {result.statistic:.4f}")
print(f"P-value: {result.pvalue:.4f}")
print(f"Degrees of freedom: {result.df}")
print(f"Reject H0: {result.reject_null}")
```
Interpretation¶
| P-value | Decision | Interpretation | Action |
|---|---|---|---|
| p < 0.01 | Strong rejection | Strong evidence of correlated effects | Use Fixed Effects |
| 0.01 \(\leq\) p < 0.05 | Rejection | Moderate evidence of correlation | Use Fixed Effects |
| 0.05 \(\leq\) p < 0.10 | Borderline | Weak evidence | Report both; lean toward FE |
| p \(\geq\) 0.10 | Fail to reject | No evidence of correlation | Use Random Effects |
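The thresholds in the table can be encoded directly when the decision has to be logged or automated. The function below is a hypothetical helper written for this page, not a PanelBox API; it simply mirrors the table rows.

```python
def interpret_mundlak(pvalue):
    """Map a p-value to the (decision, action) pair from the table above."""
    if not 0.0 <= pvalue <= 1.0:
        raise ValueError("pvalue must lie in [0, 1]")
    if pvalue < 0.01:
        return "Strong rejection", "Use Fixed Effects"
    if pvalue < 0.05:
        return "Rejection", "Use Fixed Effects"
    if pvalue < 0.10:
        return "Borderline", "Report both; lean toward FE"
    return "Fail to reject", "Use Random Effects"


for p in (0.003, 0.03, 0.07, 0.40):
    decision, action = interpret_mundlak(p)
    print(f"p={p:.3f}: {decision} -> {action}")
```

The cutoffs are conventions rather than hard rules, so the borderline band in particular is best read together with the economic reasoning behind the model.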
Examining Individual Group Means¶
The test metadata provides the coefficients on individual group-mean variables, revealing which regressors drive the correlation:
```python
result = mundlak.run(alpha=0.05)

# Coefficients on group means, with a t-ratio for each one
for var, coef in result.metadata["delta_coefficients"].items():
    se = result.metadata["standard_errors"][var]
    # A zero standard error leaves the t-ratio undefined, so report NaN
    t_stat = coef / se if se > 0 else float("nan")
    print(f"  {var}: coef={coef:.4f}, se={se:.4f}, t={t_stat:.2f}")
```
Mathematical Details¶
Wald Test Statistic¶
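The test uses a Wald statistic for the joint significance of \(\gamma\):
\[ W = \hat{\gamma}' \left[\widehat{\mathrm{Var}}(\hat{\gamma})\right]^{-1} \hat{\gamma} \;\sim\; \chi^2(K) \quad \text{under } H_0 \]
where \(K\) is the number of time-varying regressors.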
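To make the arithmetic concrete, here is a minimal sketch for \(K = 2\) (two time-varying regressors, as in the Grunfeld example), assuming the usual quadratic form \(W = \hat{\gamma}' \hat{V}^{-1} \hat{\gamma}\) compared against \(\chi^2(K)\). With two regressors both the matrix inverse and the \(\chi^2(2)\) tail probability have closed forms, so no numerical library is needed. The function name and the input numbers are invented for illustration and are not PanelBox output.

```python
import math


def wald_two_regressors(gamma_hat, vcov):
    """Wald statistic and chi-square(2) p-value for H0: gamma = 0 with K = 2."""
    g1, g2 = gamma_hat
    (v11, v12), (v21, v22) = vcov
    det = v11 * v22 - v12 * v21
    if v11 <= 0 or det <= 0:
        raise ValueError("vcov must be positive definite")
    # gamma' V^{-1} gamma, using V^{-1} = [[v22, -v12], [-v21, v11]] / det
    stat = (g1 * g1 * v22 - g1 * g2 * (v12 + v21) + g2 * g2 * v11) / det
    # For 2 degrees of freedom, P(chi2 > w) = exp(-w / 2)
    pvalue = math.exp(-stat / 2.0)
    return stat, pvalue


# Hypothetical coefficients on the group means and their covariance matrix
gamma_hat = [0.5, -0.2]
vcov = [[0.04, 0.01],
        [0.01, 0.09]]

stat, pvalue = wald_two_regressors(gamma_hat, vcov)
print(f"Wald = {stat:.4f}, df = 2, p = {pvalue:.4f}")
print(f"F = Wald / df = {stat / 2:.4f}")
```

For general \(K\) the same quadratic form applies, but the inverse and the tail probability would normally come from a linear algebra and statistics library.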
Equivalence to Hausman¶
Mundlak (1978) showed that the Correlated Random Effects (CRE) model:
\[ y_{it} = X_{it}\beta + \bar{X}_i\gamma + \alpha_i + \varepsilon_{it} \]
yields \(\hat{\beta}\) identical to the FE (within) estimator, whatever the value of \(\gamma\). The test on \(\gamma\) is asymptotically equivalent to the Hausman test under the classical RE assumptions but is computed differently, offering practical advantages.
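The equality between the CRE slope and the within estimator can be checked numerically on a small panel. The sketch below fits the augmented regression of \(y\) on a constant, \(x\) and \(\bar{x}_i\) by ordinary least squares and compares the coefficient on \(x\) with the FE slope computed from demeaned data. The data are invented, the helpers `solve` and `ols` are written here for illustration, and a single regressor is assumed to keep the code short.

```python
def solve(A, b):
    """Solve the linear system A z = b by Gaussian elimination."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - tail) / M[r][r]
    return z


def ols(X, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# Toy balanced panel: 3 entities observed for 3 periods each.
entity = [1, 1, 1, 2, 2, 2, 3, 3, 3]
x = [1.0, 2.0, 3.0, 4.0, 6.0, 8.0, 2.0, 5.0, 6.0]
y = [2.1, 2.9, 4.2, 9.8, 11.1, 13.2, 3.0, 6.1, 6.8]


def group_mean(values):
    """Entity mean of `values`, repeated for every observation."""
    sums, counts = {}, {}
    for e, v in zip(entity, values):
        sums[e] = sums.get(e, 0.0) + v
        counts[e] = counts.get(e, 0) + 1
    return [sums[e] / counts[e] for e in entity]


x_bar, y_bar = group_mean(x), group_mean(y)

# Augmented (CRE) regression: y on [1, x, x_bar].
_, beta_cre, gamma_cre = ols([[1.0, xi, xb] for xi, xb in zip(x, x_bar)], y)

# FE (within) estimator: slope of demeaned y on demeaned x.
xd = [xi - xb for xi, xb in zip(x, x_bar)]
yd = [yi - yb for yi, yb in zip(y, y_bar)]
beta_fe = sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)

print(f"beta (CRE):  {beta_cre:.6f}")
print(f"beta (FE):   {beta_fe:.6f}")
print(f"gamma (CRE): {gamma_cre:.6f}")
```

The two slope estimates agree up to floating-point error, while \(\hat{\gamma}\) picks up the gap between the between-entity and within-entity relationships.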
Implementation Details¶
PanelBox implements the Mundlak test using Pooled OLS with entity-clustered standard errors on the augmented model. This approach:
- Adds entity-mean variables for all time-varying regressors
- Estimates the augmented model with cluster-robust standard errors
- Performs a Wald test on the joint significance of the mean variables
Implementation Note
PanelBox uses Pooled OLS with clustered SEs (rather than RE estimation on the augmented model) to avoid numerical issues with variables that are constant within entities. This produces results consistent with the auxiliary regression approach used in R's plm package.
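The sketch below walks through the same steps (group means, pooled OLS on the augmented model, entity-clustered variance, Wald test) for a single regressor using only the standard library. It is not PanelBox's code: it applies the Frisch-Waugh-Lovell theorem to obtain \(\hat{\gamma}\) without matrix algebra, omits any small-sample correction to the clustered variance, and uses invented data and function names.

```python
import math
from collections import defaultdict


def simple_ols(x, y):
    """Intercept and slope of y on x, plus the residuals."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    resid = [b - intercept - slope * a for a, b in zip(x, y)]
    return intercept, slope, resid


def mundlak_one_regressor(entity, x, y):
    """Clustered Wald test on the group-mean coefficient (K = 1)."""
    # Step 1: entity means of the regressor.
    sums, counts = defaultdict(float), defaultdict(int)
    for e, v in zip(entity, x):
        sums[e] += v
        counts[e] += 1
    x_bar = [sums[e] / counts[e] for e in entity]

    # Step 2: pooled OLS of y on [1, x, x_bar] via Frisch-Waugh-Lovell:
    # partial [1, x] out of x_bar and y, then regress residual on residual.
    _, _, r = simple_ols(x, x_bar)
    _, _, y_tilde = simple_ols(x, y)
    srr = sum(a * a for a in r)
    gamma = sum(a * b for a, b in zip(r, y_tilde)) / srr
    u = [b - gamma * a for a, b in zip(r, y_tilde)]  # full-model residuals

    # Step 3: entity-clustered variance of gamma (no small-sample correction).
    scores = defaultdict(float)
    for e, a, b in zip(entity, r, u):
        scores[e] += a * b
    var_gamma = sum(s * s for s in scores.values()) / srr ** 2

    # Step 4: Wald statistic; chi-square with 1 df has tail erfc(sqrt(w / 2)).
    wald = gamma ** 2 / var_gamma
    pvalue = math.erfc(math.sqrt(wald / 2.0))
    return gamma, math.sqrt(var_gamma), wald, pvalue


# Toy panel: 5 entities, 3 periods each, with effects that rise with x.
entity = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5]
x = [1.0, 2.0, 3.0, 2.0, 4.0, 3.0, 5.0, 4.0, 6.0, 6.0, 8.0, 7.0, 9.0, 8.0, 10.0]
y = [2.6, 2.8, 3.6, 4.0, 4.7, 4.3, 5.7, 5.5, 6.6, 7.8, 8.6, 8.4, 9.8, 9.6, 10.2]

gamma, se, wald, pvalue = mundlak_one_regressor(entity, x, y)
print(f"gamma={gamma:.4f}, se={se:.4f}, Wald={wald:.4f}, p={pvalue:.4f}")
```

With several regressors the scalar operations become matrix products, and with only a handful of entities the clustered variance is imprecise, so the numbers from such a toy panel are for illustration only.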
Configuration Options¶
```python
from panelbox.validation.specification.mundlak import MundlakTest

# Basic usage (re_results is the fitted Random Effects result)
mundlak = MundlakTest(re_results)
result = mundlak.run(alpha=0.05)

# Stricter significance level
result = mundlak.run(alpha=0.01)
```
Parameters¶
| Parameter | Type | Default | Description |
|---|---|---|---|
| results | PanelResults | required | Results from Random Effects estimation |
The .run() method accepts:
| Parameter | Type | Default | Description |
|---|---|---|---|
| alpha | float | 0.05 | Significance level |
Result Metadata¶
| Key | Type | Description |
|---|---|---|
| n_time_varying_vars | int | Number of time-varying regressors |
| delta_coefficients | dict | Coefficients on group-mean variables |
| standard_errors | dict | Standard errors for group-mean coefficients |
| F_statistic | float | F-statistic (Wald / df) |
| augmented_formula | str | Formula used for augmented regression |
Advantages Over Hausman¶
The Mundlak test offers several practical advantages:
| Feature | Hausman | Mundlak |
|---|---|---|
| Models required | FE and RE | RE only |
| Positive test statistic | Not guaranteed | Always (Wald test) |
| Robust standard errors | Problematic | Fully compatible |
| Identifies which variable | No | Yes (individual \(\hat{\gamma}_k\)) |
| Time-invariant regressors | Excluded from test | Handled naturally |
| Unbalanced panels | Potential issues | Straightforward |
When to Prefer Mundlak
Use the Mundlak test when:
- You want to identify which regressors are correlated with entity effects
- The Hausman test produces a negative statistic
- You need robust/clustered standard errors for the test
- You only have RE results available
Comparing Hausman and Mundlak¶
```python
from panelbox.models.static.fixed_effects import FixedEffects
from panelbox.models.static.random_effects import RandomEffects
from panelbox.validation.specification.hausman import HausmanTest
from panelbox.validation.specification.mundlak import MundlakTest
from panelbox.datasets import load_grunfeld

data = load_grunfeld()

# Estimate both models
fe = FixedEffects("invest ~ value + capital", data, "firm", "year")
fe_results = fe.fit()
re = RandomEffects("invest ~ value + capital", data, "firm", "year")
re_results = re.fit()

# Hausman test
hausman = HausmanTest(fe_results, re_results)
print(f"Hausman: chi2={hausman.statistic:.4f}, p={hausman.pvalue:.4f}")
print(f"  Recommendation: {hausman.recommendation}")

# Mundlak test
mundlak = MundlakTest(re_results)
mundlak_result = mundlak.run(alpha=0.05)
print(f"Mundlak: Wald={mundlak_result.statistic:.4f}, p={mundlak_result.pvalue:.4f}")
print(f"  Conclusion: {mundlak_result.conclusion}")

# Both tests should agree in most cases
```
Common Pitfalls¶
RE Model Required¶
The Mundlak test is only applicable to Random Effects models. Passing FE or Pooled OLS results raises a ValueError.
Time-Invariant Regressors¶
If all regressors are time-invariant (no within-entity variation), the test cannot be computed: each entity mean would simply equal the regressor itself, so there are no distinct group-mean variables to add. Ensure at least one regressor varies over time.
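A quick pre-check for within-entity variation can catch this before running the test. The helper below is a hypothetical standard-library sketch, not a PanelBox function; the `founded` column is invented to stand in for a time-invariant regressor.

```python
def time_varying_regressors(entities, columns, tol=1e-12):
    """Return the names of columns that vary within at least one entity."""
    varying = []
    for name, values in columns.items():
        lo, hi = {}, {}
        for e, v in zip(entities, values):
            lo[e] = min(lo.get(e, v), v)
            hi[e] = max(hi.get(e, v), v)
        # A column is time-varying if any entity has a non-zero range.
        if any(hi[e] - lo[e] > tol for e in lo):
            varying.append(name)
    return varying


firm = ["A", "A", "B", "B"]
columns = {
    "value": [1.0, 2.0, 3.0, 3.5],             # changes over time
    "founded": [1950.0, 1950.0, 1980.0, 1980.0],  # constant within firm
}

usable = time_varying_regressors(firm, columns)
print("Time-varying regressors:", usable)
if not usable:
    print("No within-entity variation: the Mundlak test cannot be computed.")
```

Only the columns returned by such a check would receive a group-mean counterpart in the augmented regression.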
Small Samples¶
With few entities or short time series, the Wald test may have limited power. Consider:
- Using a higher significance level (e.g., \(\alpha = 0.10\))
- Reporting both FE and RE estimates regardless of test outcome
- Supplementing with economic reasoning about likely endogeneity
See Also¶
- Hausman Test -- Classical FE vs RE test
- RESET Test -- Functional form specification
- Specification Tests Overview -- All specification tests
- Diagnostics Overview -- Complete diagnostic workflow
References¶
- Mundlak, Y. (1978). On the pooling of time series and cross section data. Econometrica, 46(1), 69-85.
- Chamberlain, G. (1982). Multivariate regression models for panel data. Journal of Econometrics, 18(1), 5-46.
- Wooldridge, J. M. (2010). Econometric Analysis of Cross Section and Panel Data (2nd ed.). MIT Press. Chapter 10.
- Baltagi, B. H. (2021). Econometric Analysis of Panel Data (6th ed.). Springer. Chapter 4.