Stochastic Frontier Theory --- Efficiency Measurement¶
Key Takeaway
Stochastic Frontier Analysis (SFA) decomposes the error term into symmetric noise (\(v\)) and one-sided inefficiency (\(u \geq 0\)), allowing estimation of technical efficiency relative to a production or cost frontier. Panel data enables separation of time-invariant heterogeneity from time-varying inefficiency.
Motivation¶
Standard regression estimates the average relationship between inputs and outputs. But economic agents may not achieve the maximum output possible given their inputs --- they may be inefficient. SFA provides a framework to:
- Estimate the production frontier (maximum achievable output)
- Measure individual technical efficiency relative to the frontier
- Distinguish between random noise and systematic inefficiency
- Track efficiency changes over time
The Production Frontier¶
Model Specification¶
In logarithmic form, the stochastic production frontier is:
where:
- \(y_{it} = \ln(\text{output}_{it})\): log output
- \(X_{it}'\beta\): the deterministic frontier (maximum achievable output given inputs)
- \(v_{it} \sim N(0, \sigma_v^2)\): symmetric noise (weather, measurement error, luck)
- \(u_{it} \geq 0\): one-sided inefficiency term (distance below the frontier)
The composite error is \(\varepsilon_{it} = v_{it} - u_{it}\), which is negatively skewed because \(u_{it}\) shifts observations below the frontier.
Technical Efficiency¶
Technical efficiency for entity \(i\) at time \(t\) is defined as:
- \(TE_{it} = 1\): the entity operates on the frontier (fully efficient)
- \(TE_{it} < 1\): the entity operates below the frontier (inefficient)
- Example: \(TE = 0.85\) means the entity produces 85% of the maximum possible output
The Cost Frontier¶
Dual Specification¶
For the cost frontier, the sign of inefficiency is reversed:
where \(y_{it} = \ln(\text{cost}_{it})\) and \(X_{it}\) includes log output, input prices, and other cost determinants.
| Aspect | Production Frontier | Cost Frontier |
|---|---|---|
| Frontier represents | Maximum output | Minimum cost |
| Sign of \(u\) in model | Negative (\(-u\)) | Positive (\(+u\)) |
| Inefficiency effect | Reduces output below frontier | Increases cost above frontier |
| Efficiency formula | \(TE = e^{-u}\) | \(CE = e^{-u}\) (PanelBox default) |
| Expected residual skewness | Negative | Positive |
Sign Convention
Getting the sign convention wrong produces nonsensical results. Always verify: (1) the frontier type matches your dependent variable, (2) OLS residual skewness has the expected sign, and (3) efficiency scores are in \((0, 1]\).
Error Decomposition¶
Composite Error¶
The composite error \(\varepsilon_{it} = v_{it} - u_{it}\) (production) or \(\varepsilon_{it} = v_{it} + u_{it}\) (cost) has a skewed distribution that enables identification of the two components.
Distributional Assumptions¶
Noise term: Always \(v_{it} \sim N(0, \sigma_v^2)\) (symmetric, normal).
Inefficiency term --- several distributional choices:
| Distribution | \(u_{it}\) | Parameters | Properties |
|---|---|---|---|
| Half-normal | \(\lvert N(0, \sigma_u^2) \rvert\) | \(\sigma_u\) | Mode at zero; most parsimonious |
| Truncated normal | \(N^+(\mu, \sigma_u^2)\) | \(\mu, \sigma_u\) | Mode at \(\mu\) if \(\mu > 0\); more flexible |
| Exponential | \(\text{Exp}(\sigma_u)\) | \(\sigma_u\) | Mode at zero; lighter tail than half-normal |
| Gamma | \(\Gamma(P, \sigma_u)\) | \(P, \sigma_u\) | Most flexible; harder to estimate |
The half-normal distribution is the most common default. The truncated normal nests the half-normal (when \(\mu = 0\)).
Signal-to-Noise Ratio¶
- \(\lambda \to 0\): Noise dominates (no inefficiency detected)
- \(\lambda \to \infty\): Inefficiency dominates (deterministic frontier)
Panel SFA Models¶
Time-Invariant Inefficiency (Pitt-Lee 1981)¶
Inefficiency \(u_i\) is constant over time for each entity. This is restrictive but uses all \(T\) observations to estimate each \(u_i\), improving precision.
Time-Varying Inefficiency (Battese-Coelli 1992)¶
where \(\eta(t) = \exp[-\eta(t - T)]\) is a time decay function:
- \(\eta > 0\): Efficiency improves over time (inefficiency decays)
- \(\eta < 0\): Efficiency deteriorates
- \(\eta = 0\): Time-invariant (reduces to Pitt-Lee)
Determinants of Inefficiency (Battese-Coelli 1995)¶
Inefficiency depends on observable variables \(z_{it}\) (e.g., manager education, ownership structure, regulatory environment). This allows direct estimation of what drives efficiency differences.
True Fixed/Random Effects (Greene 2005)¶
The Problem with Earlier Models¶
In Pitt-Lee and Battese-Coelli models, time-invariant heterogeneity (e.g., geographic advantages, firm culture) is confounded with persistent inefficiency. A firm in a favorable location appears more efficient, even if it is not.
True Fixed Effects¶
By including entity-specific intercepts \(\alpha_i\), heterogeneity is separated from inefficiency:
- \(\alpha_i\) captures time-invariant differences (geography, technology type)
- \(u_{it}\) captures genuine time-varying inefficiency
True Random Effects¶
The random effect \(w_i\) absorbs heterogeneity while \(u_{it}\) remains the inefficiency measure. Estimation uses simulated maximum likelihood or Gauss-Hermite quadrature.
Four-Component Model¶
Motivation¶
Even True FE/RE models may not fully separate persistent from transient inefficiency. The four-component model (Colombi et al. 2014, Kumbhakar et al. 2014) provides a complete decomposition:
where:
| Component | Symbol | Nature | Interpretation |
|---|---|---|---|
| Firm heterogeneity | \(\mu_i\) | Time-invariant, symmetric | Unobserved advantages/disadvantages |
| Persistent inefficiency | \(\eta_i \geq 0\) | Time-invariant, one-sided | Structural inefficiency (hard to change) |
| Noise | \(v_{it}\) | Time-varying, symmetric | Random shocks |
| Transient inefficiency | \(u_{it} \geq 0\) | Time-varying, one-sided | Correctable inefficiency |
Efficiency Measures¶
- Overall efficiency: \(TE_{it} = \exp(-\eta_i - u_{it})\)
- Persistent efficiency: \(PE_i = \exp(-\eta_i)\)
- Transient efficiency: \(TE_{it}^R = \exp(-u_{it})\)
This decomposition is valuable for policy: persistent inefficiency requires structural reforms, while transient inefficiency can be addressed through short-term management improvements.
Estimation¶
The four-component model is estimated in stages:
- Estimate the panel model and obtain entity-level and time-varying residuals
- Decompose entity-level residuals into \(\mu_i\) and \(\eta_i\) using the method of moments
- Decompose time-varying residuals into \(v_{it}\) and \(u_{it}\)
Efficiency Estimation¶
Jondrow et al. (1982) — JLMS Estimator¶
The conditional distribution of \(u_{it}\) given \(\varepsilon_{it}\) is:
where \(\sigma^2 = \sigma_v^2 + \sigma_u^2\). The point estimate is:
Battese-Coelli (1988) — BC Estimator¶
This is the recommended estimator for technical efficiency, as it directly estimates \(E[e^{-u} \mid \varepsilon]\) rather than using \(e^{-E[u \mid \varepsilon]}\).
TFP Decomposition¶
Total Factor Productivity (TFP) growth can be decomposed using SFA results:
where \(RTS = \sum_k \partial f / \partial x_k\) is the returns-to-scale measure.
| Component | Meaning |
|---|---|
| Technical change | Frontier shift over time |
| Efficiency change | Movement toward/away from frontier |
| Scale effect | Gains/losses from being at suboptimal scale |
Practical Implications¶
- Check residual skewness before estimation --- wrong sign indicates incorrect frontier type or absence of inefficiency
- Start with half-normal distribution, then test truncated normal
- Use True FE/RE when heterogeneity is a concern (most applications)
- Four-component model provides the richest decomposition but requires sufficient \(N\) and \(T\)
- Report BC efficiency estimates (\(E[e^{-u} \mid \varepsilon]\)) rather than JLMS (\(e^{-E[u \mid \varepsilon]}\))
- Validate by checking: efficiency in \((0,1]\), reasonable mean (~0.6--0.9), expected skewness sign
Key References¶
- Aigner, D., Lovell, C.A.K. & Schmidt, P. (1977). "Formulation and Estimation of Stochastic Frontier Production Function Models." Journal of Econometrics, 6(1), 21--37. --- Original SFA formulation.
- Meeusen, W. & van den Broeck, J. (1977). "Efficiency Estimation from Cobb-Douglas Production Functions with Composed Error." International Economic Review, 18(2), 435--444. --- Independent development of SFA.
- Jondrow, J., Lovell, C.A.K., Materov, I.S. & Schmidt, P. (1982). "On the Estimation of Technical Inefficiency in the Stochastic Frontier Production Function Model." Journal of Econometrics, 19(2--3), 233--238. --- JLMS efficiency estimator.
- Battese, G.E. & Coelli, T.J. (1992). "Frontier Production Functions, Technical Efficiency and Panel Data." Journal of Productivity Analysis, 3, 153--169. --- Time-varying inefficiency with decay.
- Greene, W.H. (2005). "Reconsidering Heterogeneity in Panel Data Estimators of the Stochastic Frontier Model." Journal of Econometrics, 126(2), 269--303. --- True FE/RE models.
- Kumbhakar, S.C., Lien, G. & Hardaker, J.B. (2014). "Technical Efficiency in Competing Panel Data Models: A Study of Norwegian Grain Farming." Journal of Productivity Analysis, 41, 321--337. --- Four-component model.
- Kumbhakar, S.C. & Lovell, C.A.K. (2000). Stochastic Frontier Analysis. Cambridge University Press. --- Comprehensive textbook.
See Also¶
- Panel Fundamentals --- fixed and random effects foundations
- Quantile Theory --- distributional analysis beyond the mean
- References --- complete bibliography