Four-Component SFA¶
Quick Reference
Class: panelbox.frontier.advanced.FourComponentSFA
Import: from panelbox.frontier.advanced import FourComponentSFA
Stata equivalent: sfpanel (user-written extensions)
R equivalent: sfaR::sfacross() (partial support)
Unique in Python
PanelBox is the only Python library that implements the four-component SFA model. Elsewhere, the model is available in Stata through user-written extensions and, partially, in R through the sfaR package.
Overview¶
Standard SFA models estimate a single inefficiency term \(u_{it}\) that combines all sources of underperformance. However, in practice, inefficiency has distinct components with different policy implications:
- Persistent inefficiency (\(\eta_i\)): structural, long-run inefficiency due to organizational design, corporate culture, or institutional constraints. This component is time-invariant and requires fundamental reforms to address.
- Transient inefficiency (\(u_{it}\)): short-run, time-varying inefficiency due to temporary managerial decisions, demand shocks, or operational disruptions. This component can be addressed through day-to-day management improvements.
The four-component model (Kumbhakar, Lien & Hardaker, 2014; Colombi et al., 2014) decomposes the error into four parts, providing a richer picture of firm performance and more targeted policy recommendations.
Quick Example¶
```python
from panelbox.frontier.advanced import FourComponentSFA

model = FourComponentSFA(
    data=panel_df,
    depvar="log_output",
    exog=["log_labor", "log_capital"],
    entity="firm_id",
    time="year",
    frontier_type="production",
)
result = model.fit(verbose=True)

# Efficiency decomposition
te_persistent = result.persistent_efficiency()
te_transient = result.transient_efficiency()
te_overall = result.overall_efficiency()

print(f"Mean persistent efficiency: {te_persistent['persistent_efficiency'].mean():.4f}")
print(f"Mean transient efficiency: {te_transient['transient_efficiency'].mean():.4f}")
print(f"Mean overall efficiency: {te_overall['overall_efficiency'].mean():.4f}")
```
When to Use¶
- You want to distinguish structural (persistent) from managerial (transient) inefficiency
- Policy recommendations need to differentiate between long-run reforms and short-run management improvements
- You need to understand why overall efficiency is low -- is it structural or managerial?
- Your panel has sufficient time periods (\(T \geq 5\)) for reliable within-entity estimation
Key Assumptions
- Requires panel data with both entity and time identifiers
- Persistent inefficiency \(\eta_i\) is assumed time-invariant (entity-specific)
- Transient inefficiency \(u_{it}\) is assumed half-normally distributed
- Random heterogeneity \(\mu_i\) is assumed normally distributed
- Sufficient within-entity variation is needed for identification
The Four-Component Model¶
Model Specification¶
For a production frontier, the model for entity \(i\) in period \(t\) is:
\[ y_{it} = x_{it}'\beta + \mu_i - \eta_i + v_{it} - u_{it} \]
For a cost frontier, the two inefficiency terms enter with a positive sign. The four error components are:
| Component | Symbol | Distribution | Interpretation |
|---|---|---|---|
| Random heterogeneity | \(\mu_i\) | \(N(0, \sigma_\mu^2)\) | Technology differences, unobserved firm characteristics |
| Persistent inefficiency | \(\eta_i\) | \(N^+(0, \sigma_\eta^2)\) | Structural, long-run underperformance |
| Random noise | \(v_{it}\) | \(N(0, \sigma_v^2)\) | Weather, measurement error, luck |
| Transient inefficiency | \(u_{it}\) | \(N^+(0, \sigma_u^2)\) | Short-run managerial inefficiency |
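To make the specification concrete, the sketch below simulates a small panel from the four components in the table. It is an illustration only: the function name `simulate_four_component`, the single regressor, and all default parameter values are assumptions made for this example, not part of PanelBox.

```python
import random


def simulate_four_component(n_entities=50, n_periods=8, beta=0.6,
                            sigma_mu=0.3, sigma_eta=0.2,
                            sigma_v=0.1, sigma_u=0.15, seed=0):
    """Simulate a production-frontier panel with one regressor (hypothetical helper)."""
    rng = random.Random(seed)
    rows = []
    for i in range(n_entities):
        # Entity-level components are drawn once and held fixed over time
        mu_i = rng.gauss(0.0, sigma_mu)          # random heterogeneity, N(0, sigma_mu^2)
        eta_i = abs(rng.gauss(0.0, sigma_eta))   # persistent inefficiency, N+(0, sigma_eta^2)
        for t in range(n_periods):
            x_it = rng.gauss(1.0, 0.5)           # log input (arbitrary distribution)
            v_it = rng.gauss(0.0, sigma_v)       # random noise, N(0, sigma_v^2)
            u_it = abs(rng.gauss(0.0, sigma_u))  # transient inefficiency, N+(0, sigma_u^2)
            y_it = beta * x_it + mu_i - eta_i + v_it - u_it
            rows.append({"entity": i, "time": t, "x": x_it, "y": y_it,
                         "mu_i": mu_i, "eta_i": eta_i, "v_it": v_it, "u_it": u_it})
    return rows


panel = simulate_four_component(n_entities=3, n_periods=4, seed=1)
print(len(panel), "observations")
```

Taking the absolute value of a normal draw yields a half-normal variate, which is why both inefficiency terms are non-negative by construction.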
Three-Step Estimation¶
The four-component model is estimated sequentially:
Step 1: Within (FE) Estimator
Demean the data within entities to estimate \(\beta\), then recover the entity-level effects \(\alpha_i\) and the time-varying residuals \(\varepsilon_{it}\):
\[ y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it} \]
where \(\alpha_i = \mu_i - \eta_i\) and \(\varepsilon_{it} = v_{it} - u_{it}\).
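The within transformation in Step 1 can be sketched in plain Python for the single-regressor case. The helper `within_estimator` is hypothetical and only illustrates the demeaning logic; it does not reproduce PanelBox internals, which handle multiple regressors.

```python
from collections import defaultdict


def within_estimator(entity, x, y):
    """Within (FE) estimator for one regressor.

    Returns (beta, alpha, resid): the slope, a dict of entity effects,
    and the list of time-varying residuals.
    """
    groups = defaultdict(list)
    for idx, e in enumerate(entity):
        groups[e].append(idx)

    # Entity means of x and y
    x_bar = {e: sum(x[i] for i in idxs) / len(idxs) for e, idxs in groups.items()}
    y_bar = {e: sum(y[i] for i in idxs) / len(idxs) for e, idxs in groups.items()}

    # Demean within entities, then run OLS on the demeaned data
    x_dm = [x[i] - x_bar[entity[i]] for i in range(len(x))]
    y_dm = [y[i] - y_bar[entity[i]] for i in range(len(y))]
    beta = sum(a * b for a, b in zip(x_dm, y_dm)) / sum(a * a for a in x_dm)

    # Recover entity effects and residuals
    alpha = {e: y_bar[e] - beta * x_bar[e] for e in groups}
    resid = [y[i] - alpha[entity[i]] - beta * x[i] for i in range(len(y))]
    return beta, alpha, resid


entity = ["a", "a", "a", "b", "b", "b"]
x = [1.0, 2.0, 3.0, 1.0, 2.0, 4.0]
y = [1 + 2 * v for v in x[:3]] + [5 + 2 * v for v in x[3:]]
beta, alpha, resid = within_estimator(entity, x, y)
print(beta, alpha)
```

The recovered `alpha` values feed Step 3 and the residuals feed Step 2.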
Step 2: Separate Transient Inefficiency
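Apply cross-sectional SFA to the residuals \(\hat{\varepsilon}_{it}\) to separate \(v_{it}\) and \(u_{it}\):
\[ \hat{\varepsilon}_{it} = v_{it} - u_{it}, \quad v_{it} \sim N(0, \sigma_v^2), \quad u_{it} \sim N^+(0, \sigma_u^2) \]
This step uses half-normal maximum likelihood; the JLMS estimator then gives \(\hat{u}_{it}\).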
Step 3: Separate Persistent Inefficiency
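Apply cross-sectional SFA to the estimated fixed effects \(\hat{\alpha}_i\) to separate \(\mu_i\) and \(\eta_i\):
\[ \hat{\alpha}_i = \mu_i - \eta_i, \quad \mu_i \sim N(0, \sigma_\mu^2), \quad \eta_i \sim N^+(0, \sigma_\eta^2) \]
This step uses half-normal maximum likelihood; the JLMS estimator then gives \(\hat{\eta}_i\).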
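Steps 2 and 3 both rely on the JLMS point estimator, which has a standard closed form in the normal/half-normal case. The sketch below implements that formula for a production frontier, taking the variance parameters as given (in practice they come from the MLE). The function name `jlms_half_normal` is an assumption for illustration, and any centering or intercept adjustment a full implementation applies to the Step 1 outputs is ignored here.

```python
import math


def _norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def jlms_half_normal(eps, sigma_v, sigma_u):
    """E[u | eps] for eps = v - u, v ~ N(0, sigma_v^2), u ~ N+(0, sigma_u^2).

    Note: for extremely negative mu_star / sigma_star the normal CDF can
    underflow; this sketch does not guard against that.
    """
    sigma2 = sigma_v ** 2 + sigma_u ** 2
    mu_star = -eps * sigma_u ** 2 / sigma2
    sigma_star = sigma_u * sigma_v / math.sqrt(sigma2)
    z = mu_star / sigma_star
    return mu_star + sigma_star * _norm_pdf(z) / _norm_cdf(z)


# Step 2 usage: residuals with (sigma_v, sigma_u)
u_hat = [jlms_half_normal(e, sigma_v=0.1, sigma_u=0.15) for e in (-0.2, 0.0, 0.1)]
# Step 3 usage: entity effects with (sigma_mu, sigma_eta) in the same roles
eta_hat = jlms_half_normal(-0.1, sigma_v=0.3, sigma_u=0.2)
print(u_hat, eta_hat)
```

A more negative composed error implies a larger estimated inefficiency, and the estimate is always strictly positive.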
Efficiency Types¶
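Persistent efficiency (structural capability):
\[ TE_{p,i} = \exp(-\eta_i) \]
Transient efficiency (short-run management):
\[ TE_{t,it} = \exp(-u_{it}) \]
Overall efficiency (combined):
\[ TE_{o,it} = TE_{p,i} \times TE_{t,it} \]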
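As a small illustration of how the three efficiency types relate, the sketch below converts estimated inefficiencies into efficiency scores. The function `efficiency_scores` and its dictionary inputs are hypothetical; in PanelBox the result object's methods return these quantities directly.

```python
import math


def efficiency_scores(eta_hat, u_hat):
    """Combine persistent and transient inefficiency into efficiency scores.

    eta_hat: dict mapping entity -> persistent inefficiency estimate
    u_hat:   dict mapping (entity, time) -> transient inefficiency estimate
    """
    rows = []
    for (entity, time), u in sorted(u_hat.items()):
        te_p = math.exp(-eta_hat[entity])   # persistent efficiency, constant per entity
        te_t = math.exp(-u)                 # transient efficiency, varies over time
        rows.append({
            "entity": entity,
            "time": time,
            "persistent_efficiency": te_p,
            "transient_efficiency": te_t,
            "overall_efficiency": te_p * te_t,
        })
    return rows


scores = efficiency_scores({"A": 0.10}, {("A", 2020): 0.20, ("A", 2021): 0.05})
for row in scores:
    print(row)
```

Because the overall score is a product of two exponentials, it equals \(\exp(-(\eta_i + u_{it}))\), so the two inefficiencies add on the log scale.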
Detailed Guide¶
Full Estimation Example¶
```python
from panelbox.frontier.advanced import FourComponentSFA

# Fit the model
model = FourComponentSFA(
    data=panel_df,
    depvar="log_output",
    exog=["log_labor", "log_capital"],
    entity="firm_id",
    time="year",
    frontier_type="production",
)
result = model.fit(verbose=False)

# Print full summary
result.print_summary()
```
The print_summary() method displays:
- Sample information (observations, entities, periods)
- Variance components (\(\sigma_v^2, \sigma_u^2, \sigma_\mu^2, \sigma_\eta^2\)) and their shares
- Efficiency summary statistics (persistent, transient, overall)
Variance Components¶
```python
# Individual variance components
print(f"sigma_v (noise): {result.sigma_v:.4f}")
print(f"sigma_u (transient ineff.): {result.sigma_u:.4f}")
print(f"sigma_mu (heterogeneity): {result.sigma_mu:.4f}")
print(f"sigma_eta (persistent ineff.): {result.sigma_eta:.4f}")

# Variance shares
total_var = (result.sigma_v**2 + result.sigma_u**2
             + result.sigma_mu**2 + result.sigma_eta**2)
print("\nVariance shares:")
print(f"  Noise: {100 * result.sigma_v**2 / total_var:.1f}%")
print(f"  Transient ineff.: {100 * result.sigma_u**2 / total_var:.1f}%")
print(f"  Heterogeneity: {100 * result.sigma_mu**2 / total_var:.1f}%")
print(f"  Persistent ineff.: {100 * result.sigma_eta**2 / total_var:.1f}%")
```
Efficiency Decomposition¶
```python
# Get all efficiency types
te_p = result.persistent_efficiency()
te_t = result.transient_efficiency()
te_o = result.overall_efficiency()

# Full decomposition table
decomp = result.decomposition()
print(decomp.head())
# Columns: entity, time, mu_i, eta_i, u_it, v_it

# Overall efficiency with both components
print(te_o.head())
# Columns: entity, time, overall_efficiency,
#          persistent_efficiency, transient_efficiency
```
Bootstrap Confidence Intervals¶
```python
boot = result.bootstrap(
    n_bootstrap=100,
    confidence_level=0.95,
    random_state=42,
    verbose=True,
)

# Persistent efficiency with CIs
pers_ci = boot.persistent_efficiency_ci()
print(pers_ci.head())
# Columns: entity, persistent_efficiency, ci_lower, ci_upper

# Variance component CIs
boot.print_summary()
```
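For intuition about what a bootstrap interval represents, the sketch below computes a percentile interval from a list of bootstrap replicates of one statistic. This is a generic illustration: the helper `percentile_ci` is hypothetical, and the source does not state whether `result.bootstrap()` uses percentile intervals or another construction.

```python
import math


def percentile_ci(replicates, confidence_level=0.95):
    """Percentile bootstrap interval with linear interpolation between order statistics."""
    xs = sorted(replicates)
    alpha = 1.0 - confidence_level

    def quantile(p):
        pos = p * (len(xs) - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(xs) - 1)
        return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])

    return quantile(alpha / 2.0), quantile(1.0 - alpha / 2.0)


# Replicates would normally come from re-estimating the model on resampled entities
fake_replicates = [0.70 + 0.002 * k for k in range(101)]
ci_lower, ci_upper = percentile_ci(fake_replicates, confidence_level=0.95)
print(f"{ci_lower:.3f}, {ci_upper:.3f}")
```

With few replicates the tail quantiles are noisy, so in practice a larger number of replicates than the 100 used above gives more stable interval endpoints.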
Policy Implications¶
The decomposition directly informs policy:
| Persistent TE / Transient TE | Interpretation | Policy Response |
|---|---|---|
| High / High | Structurally efficient, well-managed | Maintain current practices |
| High / Low | Good structure, poor daily management | Management training, operational improvements |
| Low / High | Structural problems, good management | Institutional reforms, technology upgrades |
| Low / Low | Both structural and managerial issues | Comprehensive restructuring |
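The policy table can be applied mechanically once a cut-off between "high" and "low" efficiency is chosen. The sketch below does this with a hypothetical helper `classify_policy`; the default threshold of 0.8 is an arbitrary assumption, and a sample median or a sector benchmark may be more appropriate in a real analysis.

```python
def classify_policy(persistent_te, transient_te, threshold=0.8):
    """Map a pair of efficiency scores to the policy response from the table."""
    responses = {
        (True, True): "Maintain current practices",
        (True, False): "Management training, operational improvements",
        (False, True): "Institutional reforms, technology upgrades",
        (False, False): "Comprehensive restructuring",
    }
    high_persistent = persistent_te >= threshold
    high_transient = transient_te >= threshold
    return responses[(high_persistent, high_transient)]


for te_p, te_t in [(0.92, 0.88), (0.90, 0.65), (0.60, 0.91), (0.55, 0.58)]:
    print(te_p, te_t, "->", classify_policy(te_p, te_t))
```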
Configuration Options¶
FourComponentSFA Parameters¶
| Parameter | Type | Default | Description |
|---|---|---|---|
| `data` | `DataFrame` | required | Panel DataFrame |
| `depvar` | `str` | required | Dependent variable (in logs) |
| `exog` | `list[str]` | required | Exogenous variable names |
| `entity` | `str` | required | Entity identifier column |
| `time` | `str` | required | Time identifier column |
| `frontier_type` | `str` | `"production"` | `"production"` or `"cost"` |
fit() Parameters¶
| Parameter | Type | Default | Description |
|---|---|---|---|
| `verbose` | `bool` | `False` | Print step-by-step estimation progress |
FourComponentResult Methods¶
| Method | Returns | Description |
|---|---|---|
| `persistent_efficiency()` | `DataFrame` | \(TE_{p,i} = \exp(-\eta_i)\) per entity |
| `transient_efficiency()` | `DataFrame` | \(TE_{t,it} = \exp(-u_{it})\) per observation |
| `overall_efficiency()` | `DataFrame` | \(TE_{o,it} = TE_p \times TE_t\) with both components |
| `efficiency(estimator="bc")` | `DataFrame` | Overall efficiency (API compatibility) |
| `decomposition()` | `DataFrame` | Full 4-component decomposition table |
| `print_summary()` | `None` | Display formatted results |
| `tfp_decomposition(periods)` | `TFPDecomposition` | TFP growth decomposition |
| `bootstrap(n_bootstrap, ...)` | `BootstrapResult` | Bootstrap confidence intervals |
Tutorials¶
| Tutorial | Description | Link |
|---|---|---|
| Four-Component SFA | Persistent vs transient efficiency analysis | |
See Also¶
- Production and Cost Frontiers -- SFA fundamentals
- Panel SFA Models -- Classical panel models
- True Models (TFE/TRE) -- Two/three-component heterogeneity separation
- TFP Decomposition -- Productivity growth analysis using SFA results
- SFA Diagnostics -- Model validation and testing
References¶
- Kumbhakar, S. C., Lien, G., & Hardaker, J. B. (2014). Technical efficiency in competing panel data models: a study of Norwegian grain farming. Journal of Productivity Analysis, 41(2), 321-337.
- Colombi, R., Kumbhakar, S. C., Martini, G., & Vittadini, G. (2014). Closed-skew normality in stochastic frontiers with individual effects and long/short-run efficiency. Journal of Productivity Analysis, 42(2), 123-136.
- Kumbhakar, S. C., & Lovell, C. A. K. (2000). Stochastic Frontier Analysis. Cambridge University Press.
- Greene, W. H. (2005). Reconsidering heterogeneity in panel data estimators of the stochastic frontier model. Journal of Econometrics, 126(2), 269-303.