Skip to content

Stochastic Frontier Theory --- Efficiency Measurement

Key Takeaway

Stochastic Frontier Analysis (SFA) decomposes the error term into symmetric noise (\(v\)) and one-sided inefficiency (\(u \geq 0\)), allowing estimation of technical efficiency relative to a production or cost frontier. Panel data enables separation of time-invariant heterogeneity from time-varying inefficiency.

Motivation

Standard regression estimates the average relationship between inputs and outputs. But economic agents may not achieve the maximum output possible given their inputs --- they may be inefficient. SFA provides a framework to:

  • Estimate the production frontier (maximum achievable output)
  • Measure individual technical efficiency relative to the frontier
  • Distinguish between random noise and systematic inefficiency
  • Track efficiency changes over time

The Production Frontier

Model Specification

In logarithmic form, the stochastic production frontier is:

\[ y_{it} = X_{it}'\beta + v_{it} - u_{it} \]

where:

  • \(y_{it} = \ln(\text{output}_{it})\): log output
  • \(X_{it}'\beta\): the deterministic frontier (maximum achievable output given inputs)
  • \(v_{it} \sim N(0, \sigma_v^2)\): symmetric noise (weather, measurement error, luck)
  • \(u_{it} \geq 0\): one-sided inefficiency term (distance below the frontier)

The composite error is \(\varepsilon_{it} = v_{it} - u_{it}\), which is negatively skewed because \(u_{it}\) shifts observations below the frontier.

Technical Efficiency

Technical efficiency for entity \(i\) at time \(t\) is defined as:

\[ TE_{it} = \frac{y_{it}^{\text{observed}}}{y_{it}^{\text{frontier}}} = \exp(-u_{it}) \in (0, 1] \]
  • \(TE_{it} = 1\): the entity operates on the frontier (fully efficient)
  • \(TE_{it} < 1\): the entity operates below the frontier (inefficient)
  • Example: \(TE = 0.85\) means the entity produces 85% of the maximum possible output

The Cost Frontier

Dual Specification

For the cost frontier, the sign of inefficiency is reversed:

\[ y_{it} = X_{it}'\beta + v_{it} + u_{it} \]

where \(y_{it} = \ln(\text{cost}_{it})\) and \(X_{it}\) includes log output, input prices, and other cost determinants.

Aspect Production Frontier Cost Frontier
Frontier represents Maximum output Minimum cost
Sign of \(u\) in model Negative (\(-u\)) Positive (\(+u\))
Inefficiency effect Reduces output below frontier Increases cost above frontier
Efficiency formula \(TE = e^{-u}\) \(CE = e^{-u}\) (PanelBox default)
Expected residual skewness Negative Positive

Sign Convention

Getting the sign convention wrong produces nonsensical results. Always verify: (1) the frontier type matches your dependent variable, (2) OLS residual skewness has the expected sign, and (3) efficiency scores are in \((0, 1]\).

Error Decomposition

Composite Error

The composite error \(\varepsilon_{it} = v_{it} - u_{it}\) (production) or \(\varepsilon_{it} = v_{it} + u_{it}\) (cost) has a skewed distribution that enables identification of the two components.

Distributional Assumptions

Noise term: Always \(v_{it} \sim N(0, \sigma_v^2)\) (symmetric, normal).

Inefficiency term --- several distributional choices:

Distribution \(u_{it}\) Parameters Properties
Half-normal \(\lvert N(0, \sigma_u^2) \rvert\) \(\sigma_u\) Mode at zero; most parsimonious
Truncated normal \(N^+(\mu, \sigma_u^2)\) \(\mu, \sigma_u\) Mode at \(\mu\) if \(\mu > 0\); more flexible
Exponential \(\text{Exp}(\sigma_u)\) \(\sigma_u\) Mode at zero; lighter tail than half-normal
Gamma \(\Gamma(P, \sigma_u)\) \(P, \sigma_u\) Most flexible; harder to estimate

The half-normal distribution is the most common default. The truncated normal nests the half-normal (when \(\mu = 0\)).

Signal-to-Noise Ratio

\[ \lambda = \frac{\sigma_u}{\sigma_v} \]
  • \(\lambda \to 0\): Noise dominates (no inefficiency detected)
  • \(\lambda \to \infty\): Inefficiency dominates (deterministic frontier)

Panel SFA Models

Time-Invariant Inefficiency (Pitt-Lee 1981)

\[ y_{it} = X_{it}'\beta + v_{it} - u_i, \quad u_i \geq 0 \]

Inefficiency \(u_i\) is constant over time for each entity. This is restrictive but uses all \(T\) observations to estimate each \(u_i\), improving precision.

Time-Varying Inefficiency (Battese-Coelli 1992)

\[ y_{it} = X_{it}'\beta + v_{it} - u_{it}, \quad u_{it} = \eta(t) \cdot u_i \]

where \(\eta(t) = \exp[-\eta(t - T)]\) is a time decay function:

  • \(\eta > 0\): Efficiency improves over time (inefficiency decays)
  • \(\eta < 0\): Efficiency deteriorates
  • \(\eta = 0\): Time-invariant (reduces to Pitt-Lee)

Determinants of Inefficiency (Battese-Coelli 1995)

\[ u_{it} \sim N^+(z_{it}'\delta, \sigma_u^2) \]

Inefficiency depends on observable variables \(z_{it}\) (e.g., manager education, ownership structure, regulatory environment). This allows direct estimation of what drives efficiency differences.

True Fixed/Random Effects (Greene 2005)

The Problem with Earlier Models

In Pitt-Lee and Battese-Coelli models, time-invariant heterogeneity (e.g., geographic advantages, firm culture) is confounded with persistent inefficiency. A firm in a favorable location appears more efficient, even if it is not.

True Fixed Effects

\[ y_{it} = \alpha_i + X_{it}'\beta + v_{it} - u_{it} \]

By including entity-specific intercepts \(\alpha_i\), heterogeneity is separated from inefficiency:

  • \(\alpha_i\) captures time-invariant differences (geography, technology type)
  • \(u_{it}\) captures genuine time-varying inefficiency

True Random Effects

\[ y_{it} = (\bar{\alpha} + w_i) + X_{it}'\beta + v_{it} - u_{it}, \quad w_i \sim N(0, \sigma_w^2) \]

The random effect \(w_i\) absorbs heterogeneity while \(u_{it}\) remains the inefficiency measure. Estimation uses simulated maximum likelihood or Gauss-Hermite quadrature.

Four-Component Model

Motivation

Even True FE/RE models may not fully separate persistent from transient inefficiency. The four-component model (Colombi et al. 2014, Kumbhakar et al. 2014) provides a complete decomposition:

\[ \varepsilon_{it} = \mu_i - \eta_i + v_{it} - u_{it} \]

where:

Component Symbol Nature Interpretation
Firm heterogeneity \(\mu_i\) Time-invariant, symmetric Unobserved advantages/disadvantages
Persistent inefficiency \(\eta_i \geq 0\) Time-invariant, one-sided Structural inefficiency (hard to change)
Noise \(v_{it}\) Time-varying, symmetric Random shocks
Transient inefficiency \(u_{it} \geq 0\) Time-varying, one-sided Correctable inefficiency

Efficiency Measures

  • Overall efficiency: \(TE_{it} = \exp(-\eta_i - u_{it})\)
  • Persistent efficiency: \(PE_i = \exp(-\eta_i)\)
  • Transient efficiency: \(TE_{it}^R = \exp(-u_{it})\)

This decomposition is valuable for policy: persistent inefficiency requires structural reforms, while transient inefficiency can be addressed through short-term management improvements.

Estimation

The four-component model is estimated in stages:

  1. Estimate the panel model and obtain entity-level and time-varying residuals
  2. Decompose entity-level residuals into \(\mu_i\) and \(\eta_i\) using the method of moments
  3. Decompose time-varying residuals into \(v_{it}\) and \(u_{it}\)

Efficiency Estimation

Jondrow et al. (1982) — JLMS Estimator

The conditional distribution of \(u_{it}\) given \(\varepsilon_{it}\) is:

\[ (u_{it} \mid \varepsilon_{it}) \sim N^+\left(\frac{-\varepsilon_{it}\sigma_u^2}{\sigma^2}, \frac{\sigma_v^2\sigma_u^2}{\sigma^2}\right) \]

where \(\sigma^2 = \sigma_v^2 + \sigma_u^2\). The point estimate is:

\[ E[u_{it} \mid \varepsilon_{it}] = \frac{\sigma_u \sigma_v}{\sigma}\left[\frac{\phi(\varepsilon_{it}\lambda/\sigma)}{\Phi(-\varepsilon_{it}\lambda/\sigma)} - \frac{\varepsilon_{it}\lambda}{\sigma}\right] \]

Battese-Coelli (1988) — BC Estimator

\[ TE_{it} = E[\exp(-u_{it}) \mid \varepsilon_{it}] = \frac{\Phi(-\sigma_* + \mu_{*it}/\sigma_*)}{\Phi(\mu_{*it}/\sigma_*)} \exp\left(-\mu_{*it} + \frac{\sigma_*^2}{2}\right) \]

This is the recommended estimator for technical efficiency, as it directly estimates \(E[e^{-u} \mid \varepsilon]\) rather than using \(e^{-E[u \mid \varepsilon]}\).

TFP Decomposition

Total Factor Productivity (TFP) growth can be decomposed using SFA results:

\[ \Delta \ln TFP = \underbrace{\frac{\partial f}{\partial t}}_{\text{Technical change}} + \underbrace{\Delta TE}_{\text{Efficiency change}} + \underbrace{(RTS - 1) \cdot \sum_k \frac{\partial f}{\partial x_k}\Delta \ln x_k}_{\text{Scale effect}} \]

where \(RTS = \sum_k \partial f / \partial x_k\) is the returns-to-scale measure.

Component Meaning
Technical change Frontier shift over time
Efficiency change Movement toward/away from frontier
Scale effect Gains/losses from being at suboptimal scale

Practical Implications

  1. Check residual skewness before estimation --- wrong sign indicates incorrect frontier type or absence of inefficiency
  2. Start with half-normal distribution, then test truncated normal
  3. Use True FE/RE when heterogeneity is a concern (most applications)
  4. Four-component model provides the richest decomposition but requires sufficient \(N\) and \(T\)
  5. Report BC efficiency estimates (\(E[e^{-u} \mid \varepsilon]\)) rather than JLMS (\(e^{-E[u \mid \varepsilon]}\))
  6. Validate by checking: efficiency in \((0,1]\), reasonable mean (~0.6--0.9), expected skewness sign

Key References

  • Aigner, D., Lovell, C.A.K. & Schmidt, P. (1977). "Formulation and Estimation of Stochastic Frontier Production Function Models." Journal of Econometrics, 6(1), 21--37. --- Original SFA formulation.
  • Meeusen, W. & van den Broeck, J. (1977). "Efficiency Estimation from Cobb-Douglas Production Functions with Composed Error." International Economic Review, 18(2), 435--444. --- Independent development of SFA.
  • Jondrow, J., Lovell, C.A.K., Materov, I.S. & Schmidt, P. (1982). "On the Estimation of Technical Inefficiency in the Stochastic Frontier Production Function Model." Journal of Econometrics, 19(2--3), 233--238. --- JLMS efficiency estimator.
  • Battese, G.E. & Coelli, T.J. (1992). "Frontier Production Functions, Technical Efficiency and Panel Data." Journal of Productivity Analysis, 3, 153--169. --- Time-varying inefficiency with decay.
  • Greene, W.H. (2005). "Reconsidering Heterogeneity in Panel Data Estimators of the Stochastic Frontier Model." Journal of Econometrics, 126(2), 269--303. --- True FE/RE models.
  • Kumbhakar, S.C., Lien, G. & Hardaker, J.B. (2014). "Technical Efficiency in Competing Panel Data Models: A Study of Norwegian Grain Farming." Journal of Productivity Analysis, 41, 321--337. --- Four-component model.
  • Kumbhakar, S.C. & Lovell, C.A.K. (2000). Stochastic Frontier Analysis. Cambridge University Press. --- Comprehensive textbook.

See Also