Marginal Effects for Count Data¶

Quick Reference

Method: results.marginal_effects(at="overall") PPML: results.elasticity(variable), results.elasticities() Stata equivalent: margins, dydx(*) predict(n) R equivalent: margins::margins(), marginaleffects::marginaleffects()

Overview¶

Count data models use a nonlinear link function --- \(E[y \mid X] = \exp(X'\beta)\) --- which means the raw coefficients (\(\beta\)) do not directly measure the effect of a unit change in \(x\) on the expected count. Instead, \(\beta\) measures the semi-elasticity: the proportional change in \(E[y]\).

To communicate results in terms of actual count changes, researchers compute marginal effects. This page covers marginal effects for all PanelBox count models: Poisson, Negative Binomial, PPML, and Zero-Inflated models.

Quick Example¶

from panelbox.models.count import PooledPoisson
import numpy as np

model = PooledPoisson(
    endog=data["patents"],
    exog=data[["rd_spending", "employees"]],
    entity_id=data["firm"],
    time_id=data["year"]
)
results = model.fit(se_type="cluster")

# Average Marginal Effects
ame = results.marginal_effects(at="overall")
print(ame)

Three Ways to Read Coefficients¶

Count model coefficients can be interpreted in three equivalent ways:

1. Semi-Elasticity (Direct Coefficient)¶

The coefficient \(\beta_k\) is the semi-elasticity:

\[\frac{\partial \ln E[y \mid X]}{\partial x_k} = \beta_k\]

A one-unit increase in \(x_k\) changes \(E[y]\) by approximately \(100 \times \beta_k\) percent.

# Direct reading from coefficients
for name, coef in zip(results.exog_names, results.params):
    print(f"{name}: 1-unit increase -> {100*coef:.1f}% change in E[y]")

2. Incidence Rate Ratio (Exponentiated Coefficient)¶

The IRR is \(\exp(\beta_k)\), giving a multiplicative interpretation:

\[\frac{E[y \mid x_k + 1]}{E[y \mid x_k]} = \exp(\beta_k)\]

A unit increase in \(x_k\) multiplies the expected count by \(\exp(\beta_k)\).

# Incidence Rate Ratios
for name, coef in zip(results.exog_names, results.params):
    irr = np.exp(coef)
    print(f"{name}: IRR = {irr:.4f}")
    if irr > 1:
        print(f"  -> multiplies E[y] by {irr:.3f} ({100*(irr-1):.1f}% increase)")
    else:
        print(f"  -> multiplies E[y] by {irr:.3f} ({100*(1-irr):.1f}% decrease)")

3. Marginal Effect (Actual Count Change)¶

The marginal effect gives the change in the expected count for a unit change in \(x_k\):

\[\frac{\partial E[y \mid X]}{\partial x_k} = \beta_k \cdot \exp(X'\beta) = \beta_k \cdot E[y \mid X]\]

Because this depends on \(X\), it varies across observations. Two summary measures are standard:

Measure	Formula	Description
AME (Average Marginal Effect)	\(\frac{1}{N}\sum_i \beta_k \cdot \exp(X_i'\beta)\)	Average over all observations
MEM (Marginal Effect at Means)	\(\beta_k \cdot \exp(\bar{X}'\beta)\)	Evaluate at sample means

Computing Marginal Effects¶

Poisson and Negative Binomial¶

All Poisson and NB models support the marginal_effects() method:

# Average Marginal Effects (AME)
ame = results.marginal_effects(at="overall")

# Marginal Effects at Means (MEM)
mem = results.marginal_effects(at="means")

# Subset of variables
ame_subset = results.marginal_effects(at="overall", varlist=["rd_spending"])

The at parameter controls the evaluation point:

Value	Type	Description
`"overall"` or `"mean"`	AME	Average over all observations
`"means"` or `"mem"`	MEM	Evaluate at sample means

PPML Elasticities¶

PPML provides specialized elasticity methods for gravity models:

from panelbox.models.count import PPML

model = PPML(
    endog=df["trade_flow"],
    exog=df[["log_distance", "log_gdp_exp", "log_gdp_imp", "rta"]],
    entity_id=df["pair_id"],
    time_id=df["year"],
    fixed_effects=True,
    exog_names=["log_distance", "log_gdp_exp", "log_gdp_imp", "rta"]
)
results = model.fit()

# Elasticity for a specific variable
dist_elast = results.elasticity("log_distance")
print(f"Distance elasticity: {dist_elast['elasticity']:.3f}")
print(f"SE: {dist_elast['elasticity_se']:.3f}")

# All elasticities as a DataFrame
print(results.elasticities())

For log-transformed variables: the coefficient is the elasticity directly:

\[\frac{\partial \ln E[y]}{\partial \ln x} = \beta\]

For level variables: the coefficient is a semi-elasticity. The percentage effect is:

\[\text{Percentage change} = 100 \times (\exp(\beta) - 1)\%\]

# Binary variable interpretation (e.g., RTA)
rta_coef = results.params[3]  # assuming RTA is 4th variable
pct_effect = 100 * (np.exp(rta_coef) - 1)
print(f"RTA increases trade by {pct_effect:.1f}%")

Zero-Inflated Marginal Effects¶

For ZI models, the overall marginal effect combines contributions from both the inflation and count components. The expected value is:

\[E[y \mid X, Z] = (1 - \pi) \cdot \lambda\]

where \(\pi = \Lambda(Z'\gamma)\) and \(\lambda = \exp(X'\beta)\). If a variable \(x_k\) appears in both parts, the total marginal effect is:

\[\frac{\partial E[y]}{\partial x_k} = \underbrace{-\frac{\partial \pi}{\partial x_k} \cdot \lambda}_{\text{inflation effect}} + \underbrace{(1 - \pi) \cdot \beta_k \lambda}_{\text{count effect}}\]

The first term captures how changes in \(x_k\) affect the probability of being a structural zero. The second term captures how changes in \(x_k\) affect the count among non-structural-zero observations.

from panelbox.models.count import ZeroInflatedPoisson

model = ZeroInflatedPoisson(
    endog=data["patents"],
    exog_count=data[["rd_spending", "employees"]],
    exog_inflate=data[["small_firm", "new_entrant"]]
)
results = model.fit()

# Predictions for decomposition
overall_mean = results.predict(which="mean")
count_mean = results.predict(which="count-mean")
pi_hat = results.predict(which="prob-zero-structural")

# Manual AME for count variable (rd_spending)
beta_rd = results.params_count[0]
ame_rd = np.mean((1 - pi_hat) * beta_rd * count_mean)
print(f"AME of rd_spending: {ame_rd:.4f}")

# Manual AME for inflation variable (small_firm)
gamma_small = results.params_inflate[0]
# Logit derivative: pi * (1-pi) * gamma
ame_inflate = np.mean(-pi_hat * (1 - pi_hat) * gamma_small * count_mean)
print(f"AME of small_firm (via inflation): {ame_inflate:.4f}")

Comparison Across Models¶

The table below shows how marginal effects differ across count model families:

Model	ME Formula	Notes
Poisson	\(\beta_k \cdot \exp(X'\beta)\)	Simple exponential
NB	\(\beta_k \cdot \exp(X'\beta)\)	Same formula, different estimates
PPML	Same as Poisson	Use `elasticity()` for trade variables
ZIP	\((1-\pi) \beta_k \lambda - \frac{\partial\pi}{\partial x_k} \lambda\)	Two-component decomposition
ZINB	Same as ZIP	With NB count component

Reporting Best Practices¶

What to Report

Always report IRR (\(\exp(\beta)\)) alongside raw coefficients for accessibility
AME for policy analysis (in natural units: "1 more year of education increases expected patents by 0.3")
Elasticities for PPML / gravity models ("1% increase in GDP increases trade by 0.8%")
Both parts for ZI models (inflation effects and count effects separately)

import pandas as pd

# Create a comprehensive results table
table = pd.DataFrame({
    "Variable": results.exog_names,
    "Coefficient": results.params,
    "SE": results.se,
    "IRR": np.exp(results.params),
    "pct_change": 100 * (np.exp(results.params) - 1),
})
print(table.to_string(index=False, float_format="%.4f"))

Tutorials¶

Tutorial	Description	Link
Count Data Models	Marginal effects across all count specifications

References¶

Cameron, A. C., & Trivedi, P. K. (2013). Regression Analysis of Count Data (2^nd ed.). Cambridge University Press, Chapter 2.6.
Wooldridge, J. M. (2010). Econometric Analysis of Cross Section and Panel Data (2^nd ed.). MIT Press, Chapter 18.
Long, J. S., & Freese, J. (2014). Regression Models for Categorical Dependent Variables Using Stata (3^rd ed.). Stata Press.
Santos Silva, J. M. C., & Tenreyro, S. (2006). The Log of Gravity. Review of Economics and Statistics, 88(4), 641--658.