Frontier (SFA) API Reference¶

Module

Import: from panelbox.frontier import StochasticFrontier, FourComponentSFA, SFResult

Source: panelbox/frontier/

Overview¶

The frontier module implements Stochastic Frontier Analysis (SFA) for estimating production and cost frontiers by maximum likelihood. It provides:

Class	Description	Reference
`StochasticFrontier`	Main SFA model with multiple distributions	Aigner et al. (1977)
`FourComponentSFA`	Persistent + transient inefficiency decomposition	Kumbhakar et al. (2014)
True FE/RE functions	Separate heterogeneity from inefficiency	Greene (2005)

Model Structure¶

Production frontier: ln(y_i) = x_i' * beta + v_i - u_i

Cost frontier: ln(y_i) = x_i' * beta + v_i + u_i

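Where v_i ~ N(0, sigma_v^2) is symmetric noise and u_i >= 0 is one-sided inefficiency. Inefficiency lowers output in the production frontier (minus sign) and raises cost in the cost frontier (plus sign); in the cost case y_i denotes cost.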
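To make the sign convention concrete, here is a minimal standalone sketch that simulates the composed error for both frontier types. It uses only the Python standard library and does not call panelbox. The helper name `composed_error`, the half-normal draw for u_i, and the sigma values are illustrative assumptions.

```python
import random

def composed_error(v, u, frontier="production"):
    """Return v - u for a production frontier and v + u for a cost frontier."""
    if frontier == "production":
        return v - u
    if frontier == "cost":
        return v + u
    raise ValueError(f"unknown frontier type: {frontier!r}")

rng = random.Random(0)
sigma_v, sigma_u = 0.2, 0.4  # illustrative values, not library defaults

v = [rng.gauss(0.0, sigma_v) for _ in range(10_000)]       # symmetric noise
u = [abs(rng.gauss(0.0, sigma_u)) for _ in range(10_000)]  # half-normal inefficiency

eps_prod = [composed_error(vi, ui, "production") for vi, ui in zip(v, u)]
eps_cost = [composed_error(vi, ui, "cost") for vi, ui in zip(v, u)]

mean_prod = sum(eps_prod) / len(eps_prod)
mean_cost = sum(eps_cost) / len(eps_cost)
print(f"mean composed error, production: {mean_prod:.3f}")
print(f"mean composed error, cost:       {mean_cost:.3f}")
```

The production error has a negative mean and the cost error a positive one, because u_i is non-negative in both cases and only its sign in the equation changes.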

Enumerations¶

FrontierType¶

from panelbox.frontier import FrontierType

FrontierType.PRODUCTION  # Inefficiency reduces output
FrontierType.COST        # Inefficiency increases cost

DistributionType¶

from panelbox.frontier import DistributionType

DistributionType.HALF_NORMAL       # Half-normal (Aigner et al. 1977)
DistributionType.EXPONENTIAL       # Exponential (Meeusen & van den Broeck 1977)
DistributionType.TRUNCATED_NORMAL  # Truncated normal with location parameter
DistributionType.GAMMA             # Gamma distribution (Greene 1990)

ModelType¶

from panelbox.frontier import ModelType

ModelType.CROSS_SECTION     # No panel structure
ModelType.POOLED            # Pooled panel
ModelType.PITT_LEE          # Time-invariant inefficiency (1981)
ModelType.BATTESE_COELLI_92 # Time-varying inefficiency (1992)
ModelType.BATTESE_COELLI_95 # With heterogeneity variables (1995)
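The difference between time-invariant and time-varying inefficiency can be sketched with the decay function from Battese and Coelli (1992), u_it = exp(-eta * (t - T)) * u_i, where eta = 0 gives the Pitt-Lee case. This is a standalone illustration, not panelbox code; the function name `bc92_inefficiency` and the numbers are assumptions.

```python
import math

def bc92_inefficiency(u_i, t, T, eta):
    """Battese-Coelli (1992) decay: u_it = exp(-eta * (t - T)) * u_i."""
    return math.exp(-eta * (t - T)) * u_i

u_i, T = 0.30, 5  # illustrative entity-level inefficiency and final period

# eta = 0 reproduces time-invariant inefficiency (Pitt-Lee);
# eta > 0 means inefficiency shrinks over time toward u_i at t = T.
paths = {
    eta: [bc92_inefficiency(u_i, t, T, eta) for t in range(1, T + 1)]
    for eta in (0.0, 0.1)
}
for eta, path in paths.items():
    print(eta, [round(x, 3) for x in path])
```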

Classes¶

StochasticFrontier¶

Main model class for Stochastic Frontier Analysis.

Constructor¶

StochasticFrontier(
    data: pd.DataFrame,
    depvar: str,
    exog: list[str],
    entity: str | None = None,
    time: str | None = None,
    frontier: str | FrontierType = "production",
    dist: str | DistributionType = "half_normal",
    inefficiency_vars: list[str] | None = None,
    het_vars: list[str] | None = None,
    model_type: str | ModelType | None = None,
    css_time_trend: str | None = None,
)

Parameter	Type	Default	Description
`data`	`pd.DataFrame`	required	Panel or cross-section data
`depvar`	`str`	required	Dependent variable (typically log of output/cost)
`exog`	`list[str]`	required	Exogenous regressors (inputs)
`entity`	`str | None`	`None`	Entity column (required for panel models)
`time`	`str | None`	`None`	Time column (required for panel models)
`frontier`	`str | FrontierType`	`"production"`	Frontier type: `"production"` or `"cost"`
`dist`	`str | DistributionType`	`"half_normal"`	Inefficiency distribution
`inefficiency_vars`	`list[str] | None`	`None`	Variables affecting mean inefficiency
`het_vars`	`list[str] | None`	`None`	Variables affecting error variance
`model_type`	`str | ModelType | None`	`None`	Panel model type (auto-detected if `None`)
`css_time_trend`	`str | None`	`None`	Time trend specification for CSS model

Methods¶

`.fit()`¶

Estimate the frontier model via maximum likelihood.

result = model.fit()

Returns: SFResult
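For intuition about what maximum likelihood optimizes here, the sketch below evaluates the textbook normal/half-normal log-likelihood of Aigner et al. (1977) for a set of residuals. It is a standalone illustration using only the standard library and is not panelbox's internal implementation; the function names and the residual values are assumptions.

```python
import math

def norm_logpdf(z):
    """Log of the standard normal density."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def loglik_obs(eps, sigma_v, sigma_u, frontier="production"):
    """Log-density of one composed error under the normal/half-normal model.

    ln f(eps) = ln 2 - ln sigma + ln phi(eps / sigma) + ln Phi(s * eps * lambda / sigma)
    with sigma^2 = sigma_v^2 + sigma_u^2, lambda = sigma_u / sigma_v,
    s = -1 for a production frontier and s = +1 for a cost frontier.
    """
    sigma = math.hypot(sigma_v, sigma_u)
    lam = sigma_u / sigma_v
    sign = -1.0 if frontier == "production" else 1.0
    return (
        math.log(2.0)
        - math.log(sigma)
        + norm_logpdf(eps / sigma)
        + math.log(norm_cdf(sign * eps * lam / sigma))
    )

def loglik(residuals, sigma_v, sigma_u, frontier="production"):
    """Sum of per-observation log-densities."""
    return sum(loglik_obs(e, sigma_v, sigma_u, frontier) for e in residuals)

residuals = [-0.35, 0.10, -0.05, -0.60, 0.20]  # made-up values of y - x'beta
print(loglik(residuals, sigma_v=0.2, sigma_u=0.4))
```

An estimator would maximize this sum jointly over beta (through the residuals), sigma_v, and sigma_u, typically with a numerical optimizer.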

Example¶

from panelbox.frontier import StochasticFrontier

sf = StochasticFrontier(
    data=df,
    depvar="log_output",
    exog=["log_labor", "log_capital"],
    entity="firm_id",
    time="year",
    frontier="production",
    dist="half_normal",
)
result = sf.fit()
print(result.summary())

# Get efficiency scores
eff = result.efficiency(estimator="bc")
print(f"Mean efficiency: {eff.mean():.4f}")

FourComponentSFA¶

Four-Component SFA model separating persistent and transient inefficiency. Decomposes the error into:

\[y_{it} = x_{it}'\beta + (\mu_i + v_{it}) - (\eta_i + u_{it})\]

Where \(\mu_i\) is entity-level heterogeneity, \(v_{it}\) is random noise, \(\eta_i\) is persistent inefficiency, and \(u_{it}\) is transient inefficiency.

Constructor¶

FourComponentSFA(
    data: pd.DataFrame,
    depvar: str,
    exog: list[str],
    entity: str,
    time: str,
)

Parameter	Type	Default	Description
`data`	`pd.DataFrame`	required	Panel data
`depvar`	`str`	required	Dependent variable
`exog`	`list[str]`	required	Exogenous regressors
`entity`	`str`	required	Entity column
`time`	`str`	required	Time column

Example¶

from panelbox.frontier import FourComponentSFA

fc = FourComponentSFA(
    data=panel_df,
    depvar="log_output",
    exog=["log_labor", "log_capital"],
    entity="firm_id",
    time="year",
)
result = fc.fit()

# Decomposed efficiency
persistent = result.persistent_efficiency()
transient = result.transient_efficiency()
overall = result.overall_efficiency()

Result Classes¶

SFResult¶

Result container for StochasticFrontier estimation.

Key Attributes¶

Attribute	Type	Description
`params`	`np.ndarray`	Estimated coefficients
`param_names`	`list`	Parameter names
`se`	`np.ndarray`	Standard errors
`tvalues`	`np.ndarray`	t-statistics
`pvalues`	`np.ndarray`	p-values
`loglik`	`float`	Log-likelihood
`aic`	`float`	Akaike Information Criterion
`bic`	`float`	Bayesian Information Criterion
`sigma_v`	`float`	Noise standard deviation
`sigma_u`	`float`	Inefficiency standard deviation
`lambda_param`	`float`	lambda = sigma_u / sigma_v
`gamma`	`float`	gamma = sigma_u^2 / (sigma_v^2 + sigma_u^2)
`converged`	`bool`	Convergence status
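As a quick worked example of the two variance ratios in the table, the following computes lambda and gamma from assumed values of sigma_v and sigma_u, and checks the identity gamma = lambda^2 / (1 + lambda^2). The numbers are illustrative, not estimates from any dataset.

```python
sigma_v, sigma_u = 0.2, 0.4  # illustrative standard deviations

lambda_param = sigma_u / sigma_v
gamma = sigma_u**2 / (sigma_v**2 + sigma_u**2)

# The two ratios carry the same information: gamma = lambda^2 / (1 + lambda^2).
gamma_from_lambda = lambda_param**2 / (1.0 + lambda_param**2)

print(f"lambda = {lambda_param:.2f}, gamma = {gamma:.2f}")  # lambda = 2.00, gamma = 0.80
```

A gamma near 1 means most of the composed-error variance parameter is attributed to inefficiency; a gamma near 0 means it is mostly noise.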

Methods¶

`.efficiency(estimator="bc", ci_level=0.95)`¶

Compute technical efficiency scores.

Parameter	Type	Default	Description
`estimator`	`str`	`"bc"`	Estimator: `"bc"` (Battese-Coelli), `"jlms"` (Jondrow et al.), `"mode"`
`ci_level`	`float`	`0.95`	Confidence interval level

Returns: Efficiency scores in (0, 1] for production, [1, inf) for cost.
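The three estimator names correspond to standard conditional-distribution formulas. The sketch below writes them out for a production frontier with half-normal inefficiency, where u given eps is a normal with mean mu* and standard deviation sigma* truncated at zero. It is a standalone illustration under those assumptions, not the library's code; in particular, whether panelbox's `"jlms"` and `"mode"` options return u or exp(-u) is not stated in this reference.

```python
import math

def _phi(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def _Phi(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def conditional_moments(eps, sigma_v, sigma_u):
    """mu* and sigma* of the pre-truncation normal for u | eps (production case)."""
    sigma2 = sigma_v**2 + sigma_u**2
    mu_star = -eps * sigma_u**2 / sigma2
    sigma_star = sigma_u * sigma_v / math.sqrt(sigma2)
    return mu_star, sigma_star

def jlms(eps, sigma_v, sigma_u):
    """E[u | eps], the Jondrow et al. (1982) point estimate of inefficiency."""
    mu, s = conditional_moments(eps, sigma_v, sigma_u)
    z = mu / s
    return mu + s * _phi(z) / _Phi(z)

def bc(eps, sigma_v, sigma_u):
    """E[exp(-u) | eps], the Battese-Coelli (1988) efficiency estimate."""
    mu, s = conditional_moments(eps, sigma_v, sigma_u)
    z = mu / s
    return _Phi(z - s) / _Phi(z) * math.exp(-mu + 0.5 * s * s)

def mode(eps, sigma_v, sigma_u):
    """Conditional mode of u | eps, which is max(mu*, 0)."""
    mu, _ = conditional_moments(eps, sigma_v, sigma_u)
    return max(mu, 0.0)

for eps in (-0.5, 0.0, 0.3):  # illustrative residuals
    print(eps, round(bc(eps, 0.2, 0.4), 4), round(math.exp(-jlms(eps, 0.2, 0.4)), 4))
```

By Jensen's inequality the Battese-Coelli score is never below exp(-JLMS), which is one reason the two estimators give slightly different efficiency rankings near the frontier.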

`.summary()`¶

Print formatted estimation results.

`.compare_distributions()`¶

Compare different distributional assumptions.

FourComponentResult¶

Result container for FourComponentSFA.

Methods¶

.persistent_efficiency() — Persistent (time-invariant) efficiency
.transient_efficiency() — Transient (time-varying) efficiency
.overall_efficiency() — Overall efficiency (persistent x transient)
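The multiplicative relationship between the three scores can be illustrated with plain dictionaries. The firm names, years, and scores below are made up, and the data layout is an assumption; the actual return types of these methods are not specified in this reference.

```python
# One persistent score per entity, one transient score per entity-period.
persistent = {"firm_a": 0.90, "firm_b": 0.75}
transient = {
    ("firm_a", 2020): 0.95,
    ("firm_a", 2021): 0.85,
    ("firm_b", 2020): 0.80,
}

# Overall efficiency is the product of the two components.
overall = {
    (firm, year): persistent[firm] * score
    for (firm, year), score in transient.items()
}

for key, value in overall.items():
    print(key, round(value, 4))
```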

True Model Functions¶

Functions for True Fixed Effects (TFE) and True Random Effects (TRE) models that separate unobserved heterogeneity from inefficiency (Greene 2005).

loglik_true_fixed_effects¶

from panelbox.frontier import loglik_true_fixed_effects

ll = loglik_true_fixed_effects(params, y, X, groups, dist="half_normal")

loglik_true_random_effects¶

from panelbox.frontier import loglik_true_random_effects

ll = loglik_true_random_effects(params, y, X, groups, dist="half_normal")

Bias Correction¶

from panelbox.frontier import bias_correct_tfe_analytical, bias_correct_tfe_jackknife

# Analytical correction
corrected = bias_correct_tfe_analytical(params, data)

# Jackknife correction
corrected = bias_correct_tfe_jackknife(params, data)

Variance Decomposition¶

from panelbox.frontier import variance_decomposition_tre

decomp = variance_decomposition_tre(result)

Statistical Tests¶

hausman_test_tfe_tre¶

Hausman test for choosing between TFE and TRE models.

from panelbox.frontier import hausman_test_tfe_tre

h_result = hausman_test_tfe_tre(tfe_result, tre_result)

lr_test¶

Likelihood ratio test for nested frontier models.

from panelbox.frontier import lr_test

lr_result = lr_test(restricted_result, unrestricted_result)
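The arithmetic behind a likelihood ratio test is short enough to sketch directly: LR = 2 * (ll_unrestricted - ll_restricted), compared against a chi-square distribution with degrees of freedom equal to the number of restrictions. This standalone version uses closed-form chi-square tail formulas for integer degrees of freedom. The function names and log-likelihood values are assumptions, and the structure of panelbox's returned object is not shown here.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square with integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2/pi) * exp(-x/2) * sum_r chi^(2r-1) / (1*3*...*(2r-1))
    chi = math.sqrt(x)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(chi / math.sqrt(2.0)) + math.sqrt(2.0 / math.pi) * math.exp(-half) * total

def lr_statistic(loglik_restricted, loglik_unrestricted, df):
    """Return the LR statistic and its chi-square p-value."""
    stat = 2.0 * (loglik_unrestricted - loglik_restricted)
    return stat, chi2_sf(stat, df)

stat, pvalue = lr_statistic(-105.0, -100.0, df=2)  # made-up log-likelihoods
print(f"LR = {stat:.2f}, p = {pvalue:.4f}")
```

This standard chi-square reference distribution applies when the restricted parameters lie in the interior of the parameter space; restrictions on a boundary, such as sigma_u = 0, need a different treatment.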

wald_test¶

Wald test for parameter restrictions.

from panelbox.frontier import wald_test

w_result = wald_test(result, R, r)  # Test R * beta = r
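For a single linear restriction, the Wald statistic reduces to W = (R * beta - r)^2 / (R * V * R'), which is chi-square with one degree of freedom under the null. The sketch below tests constant returns to scale (the two input elasticities sum to one) with made-up coefficients and covariance matrix. It is a standalone illustration; `wald_single` is a hypothetical helper, not part of panelbox.

```python
import math

def wald_single(beta, cov, R, r):
    """Wald statistic and p-value for one linear restriction R @ beta = r."""
    k = len(beta)
    diff = sum(R[j] * beta[j] for j in range(k)) - r
    var = sum(R[i] * cov[i][j] * R[j] for i in range(k) for j in range(k))
    stat = diff * diff / var
    pvalue = math.erfc(math.sqrt(stat / 2.0))  # chi-square(1) upper tail
    return stat, pvalue

beta = [0.6, 0.3]              # illustrative labor and capital elasticities
cov = [[0.010, -0.002],
       [-0.002, 0.020]]        # illustrative covariance of the two estimates
R, r = [1.0, 1.0], 1.0         # H0: beta_labor + beta_capital = 1

stat, pvalue = wald_single(beta, cov, R, r)
print(f"W = {stat:.4f}, p = {pvalue:.4f}")
```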

inefficiency_presence_test¶

Test whether inefficiency is statistically significant (sigma_u > 0).

from panelbox.frontier import inefficiency_presence_test

test = inefficiency_presence_test(result)
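Because sigma_u = 0 lies on the boundary of the parameter space, a likelihood ratio test of this hypothesis is commonly referred to a 50:50 mixture of chi-square(0) and chi-square(1) rather than a plain chi-square(1). The sketch below computes that mixture p-value. It illustrates the common textbook approach; whether `inefficiency_presence_test` uses exactly this procedure is an assumption, and the log-likelihood values are made up.

```python
import math

def mixed_chi2_pvalue(lr_stat):
    """p-value under the 0.5 * chi2(0) + 0.5 * chi2(1) mixture."""
    if lr_stat <= 0:
        return 1.0
    # chi2(0) is a point mass at zero, so only the chi2(1) half contributes.
    return 0.5 * math.erfc(math.sqrt(lr_stat / 2.0))

loglik_ols = -52.3   # made-up log-likelihood without inefficiency
loglik_sfa = -50.1   # made-up log-likelihood of the frontier model
lr_stat = 2.0 * (loglik_sfa - loglik_ols)

print(f"LR = {lr_stat:.2f}, mixture p-value = {mixed_chi2_pvalue(lr_stat):.4f}")
```

The practical consequence is a smaller 5% critical value (about 2.71 instead of 3.84), so the plain chi-square(1) test is conservative in this setting.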

skewness_test¶

Test for skewness in OLS residuals (necessary condition for SFA).

from panelbox.frontier import skewness_test

test = skewness_test(result)
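The idea behind the check is that a production frontier implies negatively skewed residuals, since the composed error is v minus a non-negative u. The sketch below simulates such errors, demeans them (which mimics OLS residuals from an intercept-only regression), and computes the sample skewness. It is a standalone illustration with assumed parameter values and does not reproduce the library's test statistic.

```python
import random

def sample_skewness(values):
    """Third standardized sample moment, m3 / m2^(3/2)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2**1.5

rng = random.Random(42)
sigma_v, sigma_u = 0.2, 0.4  # illustrative values

# Production-frontier composed errors: symmetric noise minus half-normal inefficiency.
eps = [rng.gauss(0.0, sigma_v) - abs(rng.gauss(0.0, sigma_u)) for _ in range(5_000)]
noise_only = [rng.gauss(0.0, sigma_v) for _ in range(5_000)]

skew_frontier = sample_skewness(eps)
skew_noise = sample_skewness(noise_only)
print(f"skewness with inefficiency:    {skew_frontier:.3f}")
print(f"skewness without inefficiency: {skew_noise:.3f}")
```

Residuals with the "wrong" sign of skewness (positive for production, negative for cost) suggest the frontier specification is not supported by the data.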

vuong_test¶

Vuong (1989) test for comparing non-nested frontier models.

from panelbox.frontier import vuong_test

v_result = vuong_test(result1, result2)
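The Vuong statistic is built from per-observation log-likelihood differences: z = sqrt(n) * mean(d) / sd(d), with d_i = ll1_i - ll2_i, and is compared to a standard normal. The following standalone sketch computes it for made-up values. It omits any correction for differing numbers of parameters, and `vuong_statistic` is a hypothetical helper rather than the library function.

```python
import math

def vuong_statistic(loglik_obs_1, loglik_obs_2):
    """Return the Vuong z statistic and a two-sided normal p-value."""
    if len(loglik_obs_1) != len(loglik_obs_2):
        raise ValueError("both models must be evaluated on the same observations")
    d = [a - b for a, b in zip(loglik_obs_1, loglik_obs_2)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((x - mean_d) ** 2 for x in d) / n
    z = math.sqrt(n) * mean_d / math.sqrt(var_d)
    pvalue = math.erfc(abs(z) / math.sqrt(2.0))
    return z, pvalue

# Made-up per-observation log-likelihoods from two competing models.
ll_model_1 = [-1.0, -1.2, -0.9, -1.1]
ll_model_2 = [-1.1, -1.5, -1.1, -1.5]

z, pvalue = vuong_statistic(ll_model_1, ll_model_2)
print(f"z = {z:.3f}, p = {pvalue:.4f}")
```

A large positive z favors the first model, a large negative z favors the second, and values near zero mean the data cannot distinguish them.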

Utility Functions¶

add_translog¶

Generate translog terms (squares and interactions) for frontier estimation.

from panelbox.frontier import add_translog

df_translog = add_translog(df, variables=["log_labor", "log_capital"])
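To show what "squares and interactions" means, the sketch below builds translog terms for a list of row dictionaries. The column names (`*_sq`, `*_x_*`) and the 0.5 scaling on squared terms follow a common textbook convention and are assumptions; the names and scaling that `add_translog` actually produces may differ.

```python
from itertools import combinations

def translog_terms(rows, variables):
    """Return copies of rows with 0.5 * x^2 and pairwise x_i * x_j terms added."""
    out = []
    for row in rows:
        new_row = dict(row)
        for name in variables:
            new_row[f"{name}_sq"] = 0.5 * row[name] ** 2
        for a, b in combinations(variables, 2):
            new_row[f"{a}_x_{b}"] = row[a] * row[b]
        out.append(new_row)
    return out

rows = [{"log_labor": 2.0, "log_capital": 3.0}]  # illustrative data
expanded = translog_terms(rows, ["log_labor", "log_capital"])
print(expanded[0])
```

With k input variables this adds k squared terms and k * (k - 1) / 2 interaction terms, so two inputs yield three extra regressors.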

prepare_panel_index¶

Set up panel index (entity, time) for frontier data.

from panelbox.frontier import prepare_panel_index

df = prepare_panel_index(df, entity="firm_id", time="year")

validate_frontier_data¶

Validate data meets SFA requirements.

from panelbox.frontier import validate_frontier_data

validate_frontier_data(df, depvar="log_output", exog=["log_labor", "log_capital"])
