Stochastic Frontier Theory --- Efficiency Measurement¶

Key Takeaway

Stochastic Frontier Analysis (SFA) decomposes the error term into symmetric noise (\(v\)) and one-sided inefficiency (\(u \geq 0\)), allowing estimation of technical efficiency relative to a production or cost frontier. Panel data enables separation of time-invariant heterogeneity from time-varying inefficiency.

Motivation¶

Standard regression estimates the average relationship between inputs and outputs. But economic agents may not achieve the maximum output possible given their inputs --- they may be inefficient. SFA provides a framework to:

Estimate the production frontier (maximum achievable output)
Measure individual technical efficiency relative to the frontier
Distinguish between random noise and systematic inefficiency
Track efficiency changes over time

The Production Frontier¶

Model Specification¶

In logarithmic form, the stochastic production frontier is:

\[ y_{it} = X_{it}'\beta + v_{it} - u_{it} \]

where:

\(y_{it} = \ln(\text{output}_{it})\): log output
\(X_{it}'\beta\): the deterministic frontier (maximum achievable output given inputs)
\(v_{it} \sim N(0, \sigma_v^2)\): symmetric noise (weather, measurement error, luck)
\(u_{it} \geq 0\): one-sided inefficiency term (distance below the frontier)

The composite error is \(\varepsilon_{it} = v_{it} - u_{it}\), which is negatively skewed because \(u_{it}\) shifts observations below the frontier.

Technical Efficiency¶

Technical efficiency for entity \(i\) at time \(t\) is the ratio of observed output to stochastic frontier output, both measured in levels rather than logs:

\[ TE_{it} = \frac{\text{output}_{it}}{\exp(X_{it}'\beta + v_{it})} = \exp(-u_{it}) \in (0, 1] \]

\(TE_{it} = 1\): the entity operates on the frontier (fully efficient)
\(TE_{it} < 1\): the entity operates below the frontier (inefficient)
Example: \(TE = 0.85\) means the entity produces 85% of the maximum possible output
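
As a quick numeric check of the definition, the sketch below converts between an inefficiency value \(u\) and the efficiency score. The function names and the output figures (85 and 100 units) are illustrative assumptions, not part of any library.

```python
import math


def technical_efficiency(u):
    """TE = exp(-u) for a one-sided inefficiency term u >= 0."""
    if u < 0:
        raise ValueError("inefficiency u must be non-negative")
    return math.exp(-u)


def implied_inefficiency(te):
    """Inverse mapping: u = -ln(TE) for TE in (0, 1]."""
    if not 0.0 < te <= 1.0:
        raise ValueError("TE must lie in (0, 1]")
    return -math.log(te)


# Hypothetical firm: 85 units observed against a frontier of 100 units.
observed_output, frontier_output = 85.0, 100.0
te = observed_output / frontier_output   # ratio in levels, not logs
u = implied_inefficiency(te)             # about 0.1625
print(te, u, technical_efficiency(u))
```

For small \(u\), \(TE \approx 1 - u\), so \(u\) can be read roughly as the proportional output shortfall.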

The Cost Frontier¶

Dual Specification¶

For the cost frontier, the sign of inefficiency is reversed:

\[ y_{it} = X_{it}'\beta + v_{it} + u_{it} \]

where \(y_{it} = \ln(\text{cost}_{it})\) and \(X_{it}\) includes log output, input prices, and other cost determinants.

Aspect	Production Frontier	Cost Frontier
Frontier represents	Maximum output	Minimum cost
Sign of \(u\) in model	Negative (\(-u\))	Positive (\(+u\))
Inefficiency effect	Reduces output below frontier	Increases cost above frontier
Efficiency formula	\(TE = e^{-u}\)	\(CE = e^{-u}\) (PanelBox default)
Expected residual skewness	Negative	Positive

Sign Convention

Getting the sign convention wrong produces nonsensical results. Always verify: (1) the frontier type matches your dependent variable, (2) OLS residual skewness has the expected sign, and (3) efficiency scores are in \((0, 1]\).
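
The skewness check in point (2) can be sketched as follows. This is a minimal illustration on simulated composite errors, assuming half-normal inefficiency and made-up values for \(\sigma_v\) and \(\sigma_u\); in practice the same function would be applied to OLS residuals. The helper names are hypothetical.

```python
import random
import statistics


def sample_skewness(values):
    """Third standardized moment (population form, no small-sample correction)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((x - mean) / sd) ** 3 for x in values) / len(values)


def skewness_matches_frontier(residuals, frontier):
    """True if residual skewness has the sign expected for the frontier type."""
    skew = sample_skewness(residuals)
    return skew < 0 if frontier == "production" else skew > 0


# Simulated composite errors (illustrative parameters, not estimates).
rng = random.Random(0)
sigma_v, sigma_u = 0.2, 0.5
v = [rng.gauss(0.0, sigma_v) for _ in range(5000)]
u = [abs(rng.gauss(0.0, sigma_u)) for _ in range(5000)]  # half-normal draws

eps_production = [vi - ui for vi, ui in zip(v, u)]  # v - u
eps_cost = [vi + ui for vi, ui in zip(v, u)]        # v + u

print(sample_skewness(eps_production))  # negative for these draws
print(sample_skewness(eps_cost))        # positive for these draws
```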

Error Decomposition¶

Composite Error¶

The composite error \(\varepsilon_{it} = v_{it} - u_{it}\) (production) or \(\varepsilon_{it} = v_{it} + u_{it}\) (cost) has a skewed distribution that enables identification of the two components.

Distributional Assumptions¶

Noise term: Always \(v_{it} \sim N(0, \sigma_v^2)\) (symmetric, normal).

Inefficiency term --- several distributional choices:

Distribution	\(u_{it}\)	Parameters	Properties
Half-normal	\(\lvert N(0, \sigma_u^2) \rvert\)	\(\sigma_u\)	Mode at zero; most parsimonious
Truncated normal	\(N^+(\mu, \sigma_u^2)\)	\(\mu, \sigma_u\)	Mode at \(\mu\) if \(\mu > 0\); more flexible
Exponential	\(\text{Exp}(\sigma_u)\)	\(\sigma_u\)	Mode at zero; heavier tail than half-normal
Gamma	\(\Gamma(P, \sigma_u)\)	\(P, \sigma_u\)	Most flexible; harder to estimate

The half-normal distribution is the most common default. The truncated normal nests the half-normal (when \(\mu = 0\)).
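
The nesting claim can be checked by simulation. The sketch below draws from three of the distributions using only the standard library; it assumes \(\sigma_u\) is the scale (mean) of the exponential, which the table does not state explicitly, and the sampler names are hypothetical.

```python
import math
import random


def draw_half_normal(rng, sigma_u):
    """|N(0, sigma_u^2)|."""
    return abs(rng.gauss(0.0, sigma_u))


def draw_truncated_normal(rng, mu, sigma_u):
    """N^+(mu, sigma_u^2) by rejection; slow if mu is far below zero."""
    while True:
        x = rng.gauss(mu, sigma_u)
        if x >= 0.0:
            return x


def draw_exponential(rng, sigma_u):
    """Exponential with mean sigma_u (scale parameterization assumed)."""
    return rng.expovariate(1.0 / sigma_u)


def sample_mean(draw, n=20000, seed=1):
    """Average of n draws from a sampler that takes a random.Random instance."""
    rng = random.Random(seed)
    return sum(draw(rng) for _ in range(n)) / n


sigma_u = 0.5
half_mean = sample_mean(lambda r: draw_half_normal(r, sigma_u))
trunc0_mean = sample_mean(lambda r: draw_truncated_normal(r, 0.0, sigma_u))
expo_mean = sample_mean(lambda r: draw_exponential(r, sigma_u))

print(half_mean, trunc0_mean)   # both near sigma_u * sqrt(2 / pi)
print(expo_mean)                # near sigma_u
```

With \(\mu = 0\) the truncated-normal draws have the same mean as the half-normal draws, up to sampling error.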

Signal-to-Noise Ratio¶

\[ \lambda = \frac{\sigma_u}{\sigma_v} \]

\(\lambda \to 0\): Noise dominates (no inefficiency detected)
\(\lambda \to \infty\): Inefficiency dominates (deterministic frontier)
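
Estimation output is often reported as \((\sigma, \lambda)\) with \(\sigma^2 = \sigma_v^2 + \sigma_u^2\). The sketch below recovers the two scale parameters and also computes \(\gamma = \sigma_u^2 / \sigma^2\), an alternative parameterization that is an addition here rather than something defined in this page. Function names are hypothetical.

```python
import math


def split_sigma(sigma, lam):
    """Recover (sigma_u, sigma_v) from sigma = sqrt(sigma_u^2 + sigma_v^2) and lambda."""
    sigma_v = sigma / math.sqrt(1.0 + lam ** 2)
    sigma_u = lam * sigma_v
    return sigma_u, sigma_v


def gamma_share(lam):
    """gamma = sigma_u^2 / (sigma_u^2 + sigma_v^2), written in terms of lambda."""
    return lam ** 2 / (1.0 + lam ** 2)


for lam in (0.1, 1.0, 10.0):
    sigma_u, sigma_v = split_sigma(sigma=0.5, lam=lam)
    print(lam, round(sigma_u, 4), round(sigma_v, 4), round(gamma_share(lam), 4))
```

Note that \(\gamma\) is not literally the share of error variance due to inefficiency: for a half-normal \(u\), \(\text{Var}(u) = (1 - 2/\pi)\sigma_u^2\), which is smaller than \(\sigma_u^2\).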

Panel SFA Models¶

Time-Invariant Inefficiency (Pitt-Lee 1981)¶

\[ y_{it} = X_{it}'\beta + v_{it} - u_i, \quad u_i \geq 0 \]

Inefficiency \(u_i\) is constant over time for each entity. This is restrictive but uses all \(T\) observations to estimate each \(u_i\), improving precision.

Time-Varying Inefficiency (Battese-Coelli 1992)¶

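\[ y_{it} = X_{it}'\beta + v_{it} - u_{it}, \quad u_{it} = g(t) \cdot u_i \]

where \(g(t) = \exp[-\eta(t - T)]\) is a time-decay function, \(T\) is the final period (so \(g(T) = 1\) and \(u_{iT} = u_i\)), and \(\eta\) is a scalar decay parameter: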

\(\eta > 0\): Efficiency improves over time (inefficiency decays)
\(\eta < 0\): Efficiency deteriorates
\(\eta = 0\): Time-invariant (reduces to Pitt-Lee)
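
The three cases can be traced numerically. The sketch below evaluates \(u_{it} = \exp[-\eta(t - T)]\,u_i\) over five periods for an assumed final-period inefficiency of 0.30; the function name and values are illustrative only.

```python
import math


def bc92_path(u_i, eta, n_periods):
    """u_it = exp(-eta * (t - T)) * u_i for t = 1..T, with T = n_periods."""
    return [math.exp(-eta * (t - n_periods)) * u_i for t in range(1, n_periods + 1)]


u_i = 0.30  # hypothetical inefficiency in the final period
for eta in (0.1, 0.0, -0.1):
    path = bc92_path(u_i, eta, n_periods=5)
    efficiency = [round(math.exp(-u), 3) for u in path]
    print(eta, efficiency)
```

With a positive decay parameter the efficiency scores rise toward the final period, with zero they are flat, and with a negative value they fall. Because the decay function is common to all entities, the efficiency ranking never changes over time in this model.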

Determinants of Inefficiency (Battese-Coelli 1995)¶

\[ u_{it} \sim N^+(z_{it}'\delta, \sigma_u^2) \]

Inefficiency depends on observable variables \(z_{it}\) (e.g., manager education, ownership structure, regulatory environment). This allows direct estimation of what drives efficiency differences.
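
To see how a determinant shifts expected inefficiency, the sketch below uses the mean of a truncated normal, \(E[u] = \mu + \sigma_u\,\phi(\mu/\sigma_u)/\Phi(\mu/\sigma_u)\) with \(\mu = z'\delta\). The determinants, coefficients, and function names are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal: pdf and cdf


def truncated_normal_mean(mu, sigma_u):
    """E[u] for u ~ N^+(mu, sigma_u^2): mu + sigma_u * pdf(a) / cdf(a), a = mu / sigma_u."""
    a = mu / sigma_u
    return mu + sigma_u * _STD.pdf(a) / _STD.cdf(a)


def bc95_expected_inefficiency(z, delta, sigma_u):
    """Unconditional mean inefficiency when the pre-truncation mean is z'delta."""
    mu = sum(z_k * d_k for z_k, d_k in zip(z, delta))
    return truncated_normal_mean(mu, sigma_u)


# Hypothetical determinants: intercept, manager schooling (years), state-owned dummy.
delta = [0.40, -0.03, 0.15]   # illustrative coefficients, not estimates
sigma_u = 0.25
private_firm = [1.0, 12.0, 0.0]
state_firm = [1.0, 12.0, 1.0]
print(bc95_expected_inefficiency(private_firm, delta, sigma_u))
print(bc95_expected_inefficiency(state_firm, delta, sigma_u))
```

Because of the truncation, a coefficient in \(\delta\) is not the marginal effect on \(E[u]\); the sign carries over, but the magnitude depends on where \(z'\delta\) sits relative to zero.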

True Fixed/Random Effects (Greene 2005)¶

The Problem with Earlier Models¶

In Pitt-Lee and Battese-Coelli models, time-invariant heterogeneity (e.g., geographic advantages, firm culture) is confounded with persistent inefficiency. A firm in a favorable location appears more efficient, even if it is not.

True Fixed Effects¶

\[ y_{it} = \alpha_i + X_{it}'\beta + v_{it} - u_{it} \]

By including entity-specific intercepts \(\alpha_i\), heterogeneity is separated from inefficiency:

\(\alpha_i\) captures time-invariant differences (geography, technology type)
\(u_{it}\) captures genuine time-varying inefficiency

True Random Effects¶

\[ y_{it} = (\bar{\alpha} + w_i) + X_{it}'\beta + v_{it} - u_{it}, \quad w_i \sim N(0, \sigma_w^2) \]

The random effect \(w_i\) absorbs heterogeneity while \(u_{it}\) remains the inefficiency measure. Estimation uses simulated maximum likelihood or Gauss-Hermite quadrature.

Four-Component Model¶

Motivation¶

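True FE/RE models treat every time-invariant effect as heterogeneity, so any persistent inefficiency is absorbed into \(\alpha_i\) or \(w_i\) and goes unmeasured. The four-component model (Colombi et al. 2014, Kumbhakar et al. 2014) separates the two with a complete decomposition of the error, in which \(\eta_i\) denotes persistent inefficiency (not the Battese-Coelli decay parameter \(\eta\)):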

\[ \varepsilon_{it} = \mu_i - \eta_i + v_{it} - u_{it} \]

where:

Component	Symbol	Nature	Interpretation
Firm heterogeneity	\(\mu_i\)	Time-invariant, symmetric	Unobserved advantages/disadvantages
Persistent inefficiency	\(\eta_i \geq 0\)	Time-invariant, one-sided	Structural inefficiency (hard to change)
Noise	\(v_{it}\)	Time-varying, symmetric	Random shocks
Transient inefficiency	\(u_{it} \geq 0\)	Time-varying, one-sided	Correctable inefficiency

Efficiency Measures¶

Overall efficiency: \(TE_{it} = \exp(-\eta_i - u_{it})\)
Persistent efficiency: \(PE_i = \exp(-\eta_i)\)
Transient efficiency: \(TE_{it}^R = \exp(-u_{it})\)

This decomposition is valuable for policy: persistent inefficiency requires structural reforms, while transient inefficiency can be addressed through short-term management improvements.
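
Since the exponents add, overall efficiency is the product of the persistent and transient scores. A minimal sketch with made-up component values (the function name is hypothetical):

```python
import math


def four_component_efficiency(eta_i, u_it):
    """Efficiency scores from persistent (eta_i) and transient (u_it) inefficiency."""
    if eta_i < 0 or u_it < 0:
        raise ValueError("inefficiency components must be non-negative")
    return {
        "persistent": math.exp(-eta_i),
        "transient": math.exp(-u_it),
        "overall": math.exp(-eta_i - u_it),
    }


# Illustrative values: sizeable structural gap, small short-run gap.
scores = four_component_efficiency(eta_i=0.20, u_it=0.05)
print({name: round(value, 4) for name, value in scores.items()})
```

In this example most of the shortfall is persistent, which under the interpretation above would point to structural rather than short-term remedies.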

Estimation¶

The four-component model is estimated in stages:

Estimate the panel model and obtain entity-level and time-varying residuals
Decompose entity-level residuals into \(\mu_i\) and \(\eta_i\) using the method of moments
Decompose time-varying residuals into \(v_{it}\) and \(u_{it}\)

Efficiency Estimation¶

Jondrow et al. (1982) — JLMS Estimator¶

The conditional distribution of \(u_{it}\) given \(\varepsilon_{it}\) is:

\[ (u_{it} \mid \varepsilon_{it}) \sim N^+\left(\frac{-\varepsilon_{it}\sigma_u^2}{\sigma^2}, \frac{\sigma_v^2\sigma_u^2}{\sigma^2}\right) \]

where \(\sigma^2 = \sigma_v^2 + \sigma_u^2\). The point estimate is:

\[ E[u_{it} \mid \varepsilon_{it}] = \frac{\sigma_u \sigma_v}{\sigma}\left[\frac{\phi(\varepsilon_{it}\lambda/\sigma)}{\Phi(-\varepsilon_{it}\lambda/\sigma)} - \frac{\varepsilon_{it}\lambda}{\sigma}\right] \]
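
The point estimate translates directly into code. The sketch below assumes a production frontier with half-normal inefficiency and takes \(\sigma_u\) and \(\sigma_v\) as already estimated; the function name is hypothetical and the parameter values are illustrative.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal: pdf and cdf


def jlms_inefficiency(eps, sigma_u, sigma_v):
    """E[u | eps] for a production frontier (eps = v - u), half-normal u."""
    sigma = math.sqrt(sigma_u ** 2 + sigma_v ** 2)
    lam = sigma_u / sigma_v
    sigma_star = sigma_u * sigma_v / sigma
    z = eps * lam / sigma
    # cdf(-z) underflows to zero for very large positive eps; a log-scale
    # Mills ratio would be needed there.
    return sigma_star * (_STD.pdf(z) / _STD.cdf(-z) - z)


sigma_u, sigma_v = 0.3, 0.2   # illustrative values, not estimates
for eps in (-0.5, -0.1, 0.0, 0.3):
    print(eps, round(jlms_inefficiency(eps, sigma_u, sigma_v), 4))
```

More negative residuals map to larger estimated inefficiency, while positive residuals still yield a small positive estimate because \(u\) is one-sided.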

Battese-Coelli (1988) — BC Estimator¶

\[ TE_{it} = E[\exp(-u_{it}) \mid \varepsilon_{it}] = \frac{\Phi(-\sigma_* + \mu_{*it}/\sigma_*)}{\Phi(\mu_{*it}/\sigma_*)} \exp\left(-\mu_{*it} + \frac{\sigma_*^2}{2}\right) \]

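Here \(\mu_{*it} = -\varepsilon_{it}\sigma_u^2/\sigma^2\) and \(\sigma_*^2 = \sigma_v^2\sigma_u^2/\sigma^2\) are the location and variance parameters of the conditional distribution of \(u_{it}\) given \(\varepsilon_{it}\). This is the recommended estimator for technical efficiency, as it directly estimates \(E[e^{-u} \mid \varepsilon]\) rather than using \(e^{-E[u \mid \varepsilon]}\). By Jensen's inequality, \(E[e^{-u} \mid \varepsilon] \geq e^{-E[u \mid \varepsilon]}\), so the plug-in version understates expected efficiency.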
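
A minimal sketch comparing the two estimators, under the same assumptions as the JLMS formula (production frontier, half-normal inefficiency, known \(\sigma_u\) and \(\sigma_v\)). Here \(\mu_*\) and \(\sigma_*\) are the parameters of the conditional distribution given in the JLMS section; the function names are hypothetical.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal: pdf and cdf


def _conditional_params(eps, sigma_u, sigma_v):
    """(mu_star, sigma_star) of u | eps ~ N^+(mu_star, sigma_star^2)."""
    sigma2 = sigma_u ** 2 + sigma_v ** 2
    mu_star = -eps * sigma_u ** 2 / sigma2
    sigma_star = sigma_u * sigma_v / math.sqrt(sigma2)
    return mu_star, sigma_star


def bc_efficiency(eps, sigma_u, sigma_v):
    """E[exp(-u) | eps], the Battese-Coelli (1988) estimator."""
    mu_star, sigma_star = _conditional_params(eps, sigma_u, sigma_v)
    r = mu_star / sigma_star
    ratio = _STD.cdf(r - sigma_star) / _STD.cdf(r)
    return ratio * math.exp(-mu_star + 0.5 * sigma_star ** 2)


def plugin_efficiency(eps, sigma_u, sigma_v):
    """exp(-E[u | eps]), the plug-in based on the JLMS point estimate."""
    mu_star, sigma_star = _conditional_params(eps, sigma_u, sigma_v)
    r = mu_star / sigma_star
    expected_u = mu_star + sigma_star * _STD.pdf(r) / _STD.cdf(r)
    return math.exp(-expected_u)


sigma_u, sigma_v = 0.3, 0.2   # illustrative values, not estimates
for eps in (-0.5, -0.1, 0.0, 0.3):
    bc = bc_efficiency(eps, sigma_u, sigma_v)
    plug = plugin_efficiency(eps, sigma_u, sigma_v)
    print(eps, round(bc, 4), round(plug, 4))
```

The gap between the two is usually small, but the BC value is never below the plug-in value.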

TFP Decomposition¶

Total Factor Productivity (TFP) growth can be decomposed using SFA results:

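\[ \Delta \ln TFP = \underbrace{\frac{\partial f}{\partial t}}_{\text{Technical change}} + \underbrace{\Delta \ln TE}_{\text{Efficiency change}} + \underbrace{(RTS - 1) \cdot \sum_k \frac{e_k}{RTS}\,\Delta \ln x_k}_{\text{Scale effect}} \]

where \(e_k = \partial f / \partial \ln x_k\) is the output elasticity of input \(k\), \(RTS = \sum_k e_k\) is the returns-to-scale measure, and \(\Delta \ln TE = -\Delta u\). The scale effect vanishes under constant returns to scale (\(RTS = 1\)). This form omits the allocative component, which requires input price data.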

Component	Meaning
Technical change	Frontier shift over time
Efficiency change	Movement toward/away from frontier
Scale effect	Gains/losses from being at suboptimal scale
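
A worked example of the decomposition with made-up numbers. The scale term in this sketch weights input growth by elasticity shares \(e_k / RTS\), following the textbook form in Kumbhakar and Lovell (2000); the function name and all inputs are hypothetical.

```python
def tfp_growth_components(technical_change, d_ln_te, elasticities, d_ln_inputs):
    """Split TFP growth into technical change, efficiency change, and scale effect."""
    rts = sum(elasticities)
    # Input growth weighted by elasticity shares e_k / RTS.
    weighted_input_growth = sum(
        e_k / rts * dx_k for e_k, dx_k in zip(elasticities, d_ln_inputs)
    )
    scale_effect = (rts - 1.0) * weighted_input_growth
    return {
        "technical_change": technical_change,
        "efficiency_change": d_ln_te,
        "scale_effect": scale_effect,
        "total": technical_change + d_ln_te + scale_effect,
    }


# Illustrative annual figures: 1% frontier shift, 0.5% efficiency gain,
# two inputs with elasticities 0.6 and 0.5 (RTS = 1.1) growing 2% and 4%.
parts = tfp_growth_components(
    technical_change=0.010,
    d_ln_te=0.005,
    elasticities=[0.6, 0.5],
    d_ln_inputs=[0.02, 0.04],
)
print({name: round(value, 5) for name, value in parts.items()})
```

Here the scale effect is small relative to the other two terms because returns to scale are close to one.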

Practical Implications¶

Check residual skewness before estimation --- wrong sign indicates incorrect frontier type or absence of inefficiency
Start with half-normal distribution, then test truncated normal
Use True FE/RE when heterogeneity is a concern (most applications)
Four-component model provides the richest decomposition but requires sufficient \(N\) and \(T\)
Report BC efficiency estimates (\(E[e^{-u} \mid \varepsilon]\)) rather than JLMS (\(e^{-E[u \mid \varepsilon]}\))
Validate by checking: efficiency in \((0,1]\), reasonable mean (~0.6--0.9), expected skewness sign

Key References¶

Aigner, D., Lovell, C.A.K. & Schmidt, P. (1977). "Formulation and Estimation of Stochastic Frontier Production Function Models." Journal of Econometrics, 6(1), 21--37. --- Original SFA formulation.
Meeusen, W. & van den Broeck, J. (1977). "Efficiency Estimation from Cobb-Douglas Production Functions with Composed Error." International Economic Review, 18(2), 435--444. --- Independent development of SFA.
Jondrow, J., Lovell, C.A.K., Materov, I.S. & Schmidt, P. (1982). "On the Estimation of Technical Inefficiency in the Stochastic Frontier Production Function Model." Journal of Econometrics, 19(2--3), 233--238. --- JLMS efficiency estimator.
Battese, G.E. & Coelli, T.J. (1992). "Frontier Production Functions, Technical Efficiency and Panel Data." Journal of Productivity Analysis, 3, 153--169. --- Time-varying inefficiency with decay.
Greene, W.H. (2005). "Reconsidering Heterogeneity in Panel Data Estimators of the Stochastic Frontier Model." Journal of Econometrics, 126(2), 269--303. --- True FE/RE models.
Kumbhakar, S.C., Lien, G. & Hardaker, J.B. (2014). "Technical Efficiency in Competing Panel Data Models: A Study of Norwegian Grain Farming." Journal of Productivity Analysis, 41, 321--337. --- Four-component model.
Kumbhakar, S.C. & Lovell, C.A.K. (2000). Stochastic Frontier Analysis. Cambridge University Press. --- Comprehensive textbook.
