Frontier (SFA) API Reference¶
Module
Import: from panelbox.frontier import StochasticFrontier, FourComponentSFA, SFResult
Source: panelbox/frontier/
Overview¶
The frontier module implements Stochastic Frontier Analysis (SFA) for estimating production and cost frontiers with maximum likelihood estimation:
| Class | Description | Reference |
|---|---|---|
StochasticFrontier |
Main SFA model with multiple distributions | Aigner et al. (1977) |
FourComponentSFA |
Persistent + transient inefficiency decomposition | Kumbhakar et al. (2014) |
| True FE/RE functions | Separate heterogeneity from inefficiency | Greene (2005) |
Model Structure¶
Production frontier: ln(y_i) = x_i' * beta + v_i - u_i
Cost frontier: ln(y_i) = x_i' * beta + v_i + u_i
Where v_i ~ N(0, sigma_v^2) is noise and u_i >= 0 is inefficiency.
Enumerations¶
FrontierType¶
from panelbox.frontier import FrontierType
FrontierType.PRODUCTION # Inefficiency reduces output
FrontierType.COST # Inefficiency increases cost
DistributionType¶
from panelbox.frontier import DistributionType
DistributionType.HALF_NORMAL # Half-normal (Aigner et al. 1977)
DistributionType.EXPONENTIAL # Exponential (Meeusen & van den Broeck 1977)
DistributionType.TRUNCATED_NORMAL # Truncated normal with location parameter
DistributionType.GAMMA # Gamma distribution (Greene 1990)
ModelType¶
from panelbox.frontier import ModelType
ModelType.CROSS_SECTION # No panel structure
ModelType.POOLED # Pooled panel
ModelType.PITT_LEE # Time-invariant inefficiency (1981)
ModelType.BATTESE_COELLI_92 # Time-varying inefficiency (1992)
ModelType.BATTESE_COELLI_95 # With heterogeneity variables (1995)
Classes¶
StochasticFrontier¶
Main model class for Stochastic Frontier Analysis.
Constructor¶
StochasticFrontier(
data: pd.DataFrame,
depvar: str,
exog: list[str],
entity: str | None = None,
time: str | None = None,
frontier: str | FrontierType = "production",
dist: str | DistributionType = "half_normal",
inefficiency_vars: list[str] | None = None,
het_vars: list[str] | None = None,
model_type: str | ModelType | None = None,
css_time_trend: str | None = None,
)
| Parameter | Type | Default | Description |
|---|---|---|---|
data |
pd.DataFrame |
required | Panel or cross-section data |
depvar |
str |
required | Dependent variable (typically log of output/cost) |
exog |
list[str] |
required | Exogenous regressors (inputs) |
entity |
str \| None |
None |
Entity column (required for panel models) |
time |
str \| None |
None |
Time column (required for panel models) |
frontier |
str \| FrontierType |
"production" |
Frontier type: "production" or "cost" |
dist |
str \| DistributionType |
"half_normal" |
Inefficiency distribution |
inefficiency_vars |
list[str] \| None |
None |
Variables affecting mean inefficiency |
het_vars |
list[str] \| None |
None |
Variables affecting error variance |
model_type |
str \| ModelType \| None |
None |
Panel model type (auto-detected if None) |
css_time_trend |
str \| None |
None |
Time trend specification for CSS model |
Methods¶
.fit()¶
Estimate the frontier model via maximum likelihood.
Returns: SFResult
Example¶
from panelbox.frontier import StochasticFrontier
sf = StochasticFrontier(
data=df,
depvar="log_output",
exog=["log_labor", "log_capital"],
entity="firm_id",
time="year",
frontier="production",
dist="half_normal",
)
result = sf.fit()
print(result.summary())
# Get efficiency scores
eff = result.efficiency(estimator="bc")
print(f"Mean efficiency: {eff.mean():.4f}")
FourComponentSFA¶
Four-Component SFA model separating persistent and transient inefficiency. Decomposes the error into:
Where \(\eta_i\) is persistent and \(u_{it}\) is transient inefficiency.
Constructor¶
| Parameter | Type | Default | Description |
|---|---|---|---|
data |
pd.DataFrame |
required | Panel data |
depvar |
str |
required | Dependent variable |
exog |
list[str] |
required | Exogenous regressors |
entity |
str |
required | Entity column |
time |
str |
required | Time column |
Example¶
from panelbox.frontier import FourComponentSFA
fc = FourComponentSFA(
data=panel_df,
depvar="log_output",
exog=["log_labor", "log_capital"],
entity="firm_id",
time="year",
)
result = fc.fit()
# Decomposed efficiency
persistent = result.persistent_efficiency()
transient = result.transient_efficiency()
overall = result.overall_efficiency()
Result Classes¶
SFResult¶
Result container for StochasticFrontier estimation.
Key Attributes¶
| Attribute | Type | Description |
|---|---|---|
params |
np.ndarray |
Estimated coefficients |
param_names |
list |
Parameter names |
se |
np.ndarray |
Standard errors |
tvalues |
np.ndarray |
t-statistics |
pvalues |
np.ndarray |
p-values |
loglik |
float |
Log-likelihood |
aic |
float |
Akaike Information Criterion |
bic |
float |
Bayesian Information Criterion |
sigma_v |
float |
Noise standard deviation |
sigma_u |
float |
Inefficiency standard deviation |
lambda_param |
float |
lambda = sigma_u / sigma_v |
gamma |
float |
gamma = sigma_u^2 / (sigma_v^2 + sigma_u^2) |
converged |
bool |
Convergence status |
Methods¶
.efficiency(estimator="bc", ci_level=0.95)¶
Compute technical efficiency scores.
| Parameter | Type | Default | Description |
|---|---|---|---|
estimator |
str |
"bc" |
Estimator: "bc" (Battese-Coelli), "jlms" (Jondrow et al.), "mode" |
ci_level |
float |
0.95 |
Confidence interval level |
Returns: Efficiency scores in (0, 1] for production, [1, inf) for cost.
.summary()¶
Print formatted estimation results.
.compare_distributions()¶
Compare different distributional assumptions.
FourComponentResult¶
Result container for FourComponentSFA.
Methods¶
.persistent_efficiency()— Persistent (time-invariant) efficiency.transient_efficiency()— Transient (time-varying) efficiency.overall_efficiency()— Overall efficiency (persistent x transient)
True Model Functions¶
Functions for True Fixed Effects (TFE) and True Random Effects (TRE) models that separate unobserved heterogeneity from inefficiency (Greene 2005).
loglik_true_fixed_effects¶
from panelbox.frontier import loglik_true_fixed_effects
ll = loglik_true_fixed_effects(params, y, X, groups, dist="half_normal")
loglik_true_random_effects¶
from panelbox.frontier import loglik_true_random_effects
ll = loglik_true_random_effects(params, y, X, groups, dist="half_normal")
Bias Correction¶
from panelbox.frontier import bias_correct_tfe_analytical, bias_correct_tfe_jackknife
# Analytical correction
corrected = bias_correct_tfe_analytical(params, data)
# Jackknife correction
corrected = bias_correct_tfe_jackknife(params, data)
Variance Decomposition¶
from panelbox.frontier import variance_decomposition_tre
decomp = variance_decomposition_tre(result)
Statistical Tests¶
hausman_test_tfe_tre¶
Hausman test for choosing between TFE and TRE models.
from panelbox.frontier import hausman_test_tfe_tre
h_result = hausman_test_tfe_tre(tfe_result, tre_result)
lr_test¶
Likelihood ratio test for nested frontier models.
wald_test¶
Wald test for parameter restrictions.
inefficiency_presence_test¶
Test whether inefficiency is statistically significant (sigma_u > 0).
skewness_test¶
Test for skewness in OLS residuals (necessary condition for SFA).
vuong_test¶
Vuong (1989) test for comparing non-nested frontier models.
Utility Functions¶
add_translog¶
Generate translog terms (squares and interactions) for frontier estimation.
from panelbox.frontier import add_translog
df_translog = add_translog(df, variables=["log_labor", "log_capital"])
prepare_panel_index¶
Set up panel index (entity, time) for frontier data.
from panelbox.frontier import prepare_panel_index
df = prepare_panel_index(df, entity="firm_id", time="year")
validate_frontier_data¶
Validate data meets SFA requirements.
from panelbox.frontier import validate_frontier_data
validate_frontier_data(df, depvar="log_output", exog=["log_labor", "log_capital"])
See Also¶
- Tutorials: Frontier — Step-by-step SFA guide
- Theory: Frontier — SFA theory and derivations