Panel VAR Forecasting¶
Quick Reference
Class: panelbox.var.forecast.ForecastResult
Method: PanelVARResult.forecast()
Import: from panelbox.var import PanelVAR (forecasts accessed via results)
Stata equivalent: fcast compute
R equivalent: vars::predict()
Overview¶
Panel VAR forecasting generates multi-step-ahead predictions for all endogenous variables and all entities simultaneously. Forecasts are produced iteratively: the one-step-ahead forecast uses observed data, while longer horizons use previously generated forecasts as inputs.
For a VAR(p) system, the \(h\)-step-ahead forecast is:
where \(\hat{Y}_{i,T+h-l} = Y_{i,T+h-l}\) for \(T+h-l \le T\) (observed data) and \(\hat{Y}_{i,T+h-l}\) is the forecast for \(T+h-l > T\).
PanelBox supports both bootstrap and analytical confidence intervals to quantify forecast uncertainty.
Quick Example¶
from panelbox.var import PanelVARData, PanelVAR
# Estimate model
var_data = PanelVARData(df, endog_vars=["gdp", "inflation", "rate"],
entity_col="country", time_col="year", lags=2)
model = PanelVAR(data=var_data)
results = model.fit(cov_type="clustered")
# Generate 5-step ahead forecast with bootstrap CIs
forecast = results.forecast(
steps=5,
ci_method="bootstrap",
n_bootstrap=500,
ci_level=0.95,
)
# View summary
print(forecast.summary())
# Convert to DataFrame for a specific entity
df_fcst = forecast.to_dataframe(entity=0)
print(df_fcst)
# Plot forecast for one entity/variable
forecast.plot(entity=0, variable="gdp")
When to Use¶
- Economic forecasting: Predicting macroeconomic variables (GDP, inflation, unemployment) across countries
- Policy scenario analysis: Projecting the effects of current conditions forward
- Out-of-sample evaluation: Testing the predictive ability of the estimated model
- Business planning: Forecasting firm-level metrics across multiple business units
Key Assumptions
- Stability: The VAR must be stable (
results.is_stable() == True) for forecasts to remain bounded - Structural stability: The estimated relationships are assumed to hold in the forecast period
- No future shocks: Point forecasts assume \(\varepsilon_{T+h} = 0\); CIs account for shock uncertainty
- Accuracy degrades with horizon: Forecast uncertainty grows with \(h\)
Detailed Guide¶
Computing Forecasts¶
forecast = results.forecast(
steps=10, # Number of steps ahead
exog_future=None, # Future exogenous values (if applicable)
ci_method="bootstrap", # None, "bootstrap", or "analytical"
ci_level=0.95, # Confidence level
n_bootstrap=500, # Bootstrap replications
seed=42, # Random seed for reproducibility
)
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
steps |
int |
10 |
Number of forecast horizons |
exog_future |
np.ndarray |
None |
Future exogenous values, shape (steps, N, n_exog) |
ci_method |
str |
None |
None, "bootstrap", or "analytical" |
ci_level |
float |
0.95 |
Confidence level for intervals |
n_bootstrap |
int |
500 |
Number of bootstrap replications |
seed |
int |
None |
Random seed for reproducibility |
Confidence Interval Methods¶
The bootstrap approach resamples residuals to generate forecast paths that account for both parameter uncertainty and shock uncertainty.
forecast = results.forecast(
steps=10,
ci_method="bootstrap",
n_bootstrap=1000,
ci_level=0.95,
seed=42,
)
Procedure:
- Resample residuals with replacement
- Add resampled shocks to the iterative forecast path
- Repeat \(B\) times
- Compute percentile intervals from the distribution of forecast paths
Analytical CIs use the MA representation to compute forecast error variance:
Note
Analytical CIs only account for shock uncertainty (assuming known parameters). They are faster but may understate total uncertainty in small samples. Requires a stable VAR.
Accessing Results¶
The ForecastResult object stores forecasts for all entities and variables.
# Point forecasts: shape (steps, N, K)
print(forecast.forecasts.shape)
# Convert to DataFrame
df_all = forecast.to_dataframe() # All entities (long format)
df_usa = forecast.to_dataframe(entity="USA") # Single entity (wide format)
df_gdp = forecast.to_dataframe(entity=0, variable="gdp") # Single entity + variable
# Summary
print(forecast.summary())
ForecastResult Attributes:
| Attribute | Type | Description |
|---|---|---|
forecasts |
np.ndarray |
Point forecasts, shape (steps, N, K) |
ci_lower |
np.ndarray or None |
Lower CI bounds, shape (steps, N, K) |
ci_upper |
np.ndarray or None |
Upper CI bounds, shape (steps, N, K) |
horizon |
int |
Forecast horizon |
K |
int |
Number of variables |
N |
int |
Number of entities |
endog_names |
list[str] |
Variable names |
entity_names |
list[str] |
Entity names |
ci_level |
float |
Confidence level |
method |
str |
Forecast method |
ci_method |
str |
CI method used |
Visualization¶
# Basic forecast plot with CIs
forecast.plot(entity=0, variable="gdp", backend="plotly")
# With actual values for comparison (out-of-sample evaluation)
forecast.plot(entity=0, variable="gdp", actual=y_actual, backend="matplotlib")
The plot shows:
- Point forecast (dashed blue line with markers)
- Confidence interval band (shaded area)
- Actual values (black line, if provided)
Forecast Evaluation¶
When actual data is available, evaluate forecast accuracy:
# Evaluate against actual values
# actual should have shape (steps, N, K) or (steps, K) for single entity
accuracy = forecast.evaluate(actual=y_actual, entity=0)
print(accuracy)
Accuracy Metrics:
| Metric | Formula | Interpretation |
|---|---|---|
| RMSE | \(\sqrt{\frac{1}{H}\sum_{h=1}^{H}(y_{T+h} - \hat{y}_{T+h})^2}\) | Scale-dependent; lower is better |
| MAE | $\frac{1}{H}\sum_{h=1}^{H} | y_{T+h} - \hat{y}_{T+h} |
| MAPE | $\frac{100}{H}\sum_{h=1}^{H}\frac{ | y_{T+h} - \hat{y}_{T+h} |
Practical Considerations¶
Forecast Horizon
Forecast accuracy degrades rapidly with the horizon. A useful rule of thumb:
- Short-term (\(h \le p\)): Forecasts use mostly observed data; relatively reliable
- Medium-term (\(p < h \le 2p\)): Forecasts increasingly rely on own predictions; uncertainty grows
- Long-term (\(h > 2p\)): Dominated by the VAR dynamics; confidence intervals widen substantially
Unstable Systems
If results.is_stable() == False, forecasts will diverge (grow without bound). This is not a numerical error -- it reflects the explosive dynamics of the estimated system. Solutions:
- Difference non-stationary variables
- Use VECM for cointegrated variables
- Reduce lag order
- Check for data errors
Complete Workflow Example¶
import numpy as np
import pandas as pd
from panelbox.var import PanelVARData, PanelVAR
# Load data
df = pd.read_csv("macro_panel.csv")
# Split into training and test sets (hold out last 5 periods)
years = sorted(df["year"].unique())
cutoff = years[-5]
df_train = df[df["year"] <= cutoff]
df_test = df[df["year"] > cutoff]
# Estimate Panel VAR on training data
var_data = PanelVARData(
data=df_train,
endog_vars=["gdp_growth", "inflation", "interest_rate"],
entity_col="country",
time_col="year",
lags=2,
)
model = PanelVAR(data=var_data)
results = model.fit(cov_type="clustered")
# Check stability
print(f"Stable: {results.is_stable()}")
# Generate forecasts with bootstrap CIs
forecast = results.forecast(
steps=5,
ci_method="bootstrap",
n_bootstrap=500,
ci_level=0.95,
seed=42,
)
print(forecast.summary())
# View forecast for a specific entity
df_fcst = forecast.to_dataframe(entity=0)
print(df_fcst)
# Plot forecast with uncertainty bands
forecast.plot(entity=0, variable="gdp_growth", backend="matplotlib")
Tutorials¶
| Tutorial | Description | Link |
|---|---|---|
| VAR Forecasting | Multi-step forecasting and evaluation |
See Also¶
- Panel VAR Estimation -- Model setup and estimation
- Impulse Response Functions -- Dynamic effects of shocks
- FEVD -- Variance decomposition
- Granger Causality -- Predictive relationships
- VECM -- Forecasting with cointegrated systems
References¶
- Luetkepohl, H. (2005). New Introduction to Multiple Time Series Analysis. Springer-Verlag.
- Abrigo, M. R., & Love, I. (2016). Estimation of panel vector autoregression in Stata. The Stata Journal, 16(3), 778-804.
- Holtz-Eakin, D., Newey, W., & Rosen, H. S. (1988). Estimating vector autoregressions with panel data. Econometrica, 56(6), 1371-1395.