Count Data Tutorials
Learning Path
- Prerequisites: Static Models tutorials, basic MLE concepts
- Time: 3--6 hours
- Level: Beginner -- Advanced
Overview
Count data models apply when the dependent variable is a non-negative integer, such as patent counts, the number of doctor visits, or accident frequencies. Through PPML, the same Poisson machinery also extends to non-negative outcomes that are not strictly counts, such as trade flows. Standard linear regression is inappropriate because it can predict negative values and ignores the discrete, non-negative nature of count data.
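To make the point about negative predictions concrete, here is a minimal sketch using only the standard library and made-up numbers (not data from the tutorials). An ordinary least squares line fitted to counts that decay toward zero predicts a negative count at the largest x, while an exponential mean function cannot. The coefficients used for the exponential mean are hypothetical values chosen for illustration.

```python
import math

# Illustrative (made-up) data: counts that decay toward zero as x grows.
x = [0, 1, 2, 3, 4, 5]
y = [8, 4, 2, 1, 0, 0]

def ols_line(x, y):
    """Closed-form simple OLS: returns (intercept, slope)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

intercept, slope = ols_line(x, y)
linear_pred = intercept + slope * 5  # linear prediction at x = 5 (negative here)

# An exponential mean exp(a + b * x) is positive for any a, b, x.
# Hypothetical coefficients: a = log(8), b = -log(2).
exp_pred = math.exp(math.log(8) - math.log(2) * 5)

print(f"OLS prediction at x=5: {linear_pred:.3f}")
print(f"Exponential-mean prediction at x=5: {exp_pred:.3f}")
```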
These tutorials cover the Poisson model (pooled, fixed effects, random effects), quasi-maximum likelihood (QML) Poisson, the Poisson Pseudo-Maximum Likelihood (PPML) estimator for gravity models in trade, negative binomial models for overdispersion, and zero-inflated models for excess zeros.
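For orientation, these models differ mainly in what they assume about the conditional mean and variance. The summary below uses standard textbook notation; the symbols μ, α, and π are introduced here and are not taken from the notebooks. The negative binomial line shows the common NB2 parameterization, which may differ from the one the library uses.

```latex
\begin{align*}
\text{Poisson:} \quad
  & \Pr(y_{it} = k \mid x_{it}) = \frac{e^{-\mu_{it}} \mu_{it}^{k}}{k!},
  \qquad \mu_{it} = \exp(x_{it}'\beta),
  \qquad \operatorname{Var}(y_{it} \mid x_{it}) = \mu_{it} \\
\text{Negative binomial (NB2):} \quad
  & \operatorname{E}(y_{it} \mid x_{it}) = \mu_{it},
  \qquad \operatorname{Var}(y_{it} \mid x_{it}) = \mu_{it} + \alpha \mu_{it}^{2},
  \quad \alpha \ge 0 \\
\text{Zero-inflated Poisson:} \quad
  & \Pr(y_{it} = 0 \mid x_{it}) = \pi_{it} + (1 - \pi_{it})\, e^{-\mu_{it}},
  \qquad \Pr(y_{it} = k \mid x_{it}) = (1 - \pi_{it}) \frac{e^{-\mu_{it}} \mu_{it}^{k}}{k!},
  \quad k \ge 1
\end{align*}
```

Setting α = 0 in the NB2 variance recovers Poisson equidispersion. QML Poisson keeps only the mean specification and drops the variance assumption, which is why it is paired with robust standard errors.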
The PPML Gravity notebook provides a self-contained introduction to gravity models using PPML.
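As background for that notebook, the usual argument for PPML (following Santos Silva & Tenreyro, 2006) can be written in two lines. The variable names mirror the Quick Example formula rather than the notebook itself, and η is notation introduced here for a multiplicative error term.

```latex
\begin{align*}
T_{ij} &= \exp\!\big(\beta_0 + \beta_1 \ln \mathrm{GDP}_i + \beta_2 \ln \mathrm{GDP}_j
          + \beta_3 \ln \mathrm{dist}_{ij}\big)\, \eta_{ij},
          \qquad \operatorname{E}(\eta_{ij} \mid x_{ij}) = 1 \\
\ln T_{ij} &= \beta_0 + \beta_1 \ln \mathrm{GDP}_i + \beta_2 \ln \mathrm{GDP}_j
          + \beta_3 \ln \mathrm{dist}_{ij} + \ln \eta_{ij}
\end{align*}
```

OLS on the second line drops every observation with zero trade and is inconsistent when the variance of η depends on the regressors, because the conditional mean of ln η then depends on them too. PPML estimates the first line directly, so zeros stay in the sample.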
Notebooks
| # | Tutorial | Level | Time | Colab |
|---|---|---|---|---|
| 1 | Poisson Introduction | Beginner | 45 min | |
| 2 | Negative Binomial | Intermediate | 45 min | |
| 3 | FE/RE Count Models | Intermediate | 45 min | |
| 4 | PPML Gravity Models | Intermediate | 60 min | |
| 5 | Zero-Inflated Models | Advanced | 45 min | |
| 6 | Marginal Effects for Count | Advanced | 45 min | |
| 7 | Innovation Case Study | Advanced | 60 min | |
Learning Paths
Core (3 hours)
Essential count data methods:
Notebooks: 1, 2, 3, 4
Covers Poisson, negative binomial, FE/RE count models, and PPML for gravity equations.
Complete (6 hours)
Full count data analysis coverage:
Notebooks: 1--7
Adds zero-inflated models, marginal effects, overdispersion diagnostics, and a complete case study.
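As a rough illustration of what an overdispersion diagnostic looks at, the sketch below compares the sample variance with the sample mean and computes the Cameron-Trivedi auxiliary-regression statistic for the simplest possible case: an intercept-only Poisson model, where every fitted mean equals the sample mean. The data are made up, and the function name `cameron_trivedi_intercept_only` is hypothetical, not part of panelbox.

```python
import math

def cameron_trivedi_intercept_only(y):
    """Return (dispersion_ratio, alpha_hat, t_stat) for an intercept-only Poisson fit.

    Hypothetical helper for illustration. With no regressors the Poisson MLE
    of the mean is the sample mean, so every fitted value mu_i equals it.
    """
    n = len(y)
    mu = sum(y) / n
    var = sum((yi - mu) ** 2 for yi in y) / (n - 1)

    # Auxiliary regression without intercept:
    #   ((y - mu)^2 - y) / mu = alpha * mu + error
    z = [((yi - mu) ** 2 - yi) / mu for yi in y]
    z_bar = sum(z) / n
    alpha_hat = z_bar / mu  # OLS slope when the regressor is the constant mu
    resid_var = sum((zi - z_bar) ** 2 for zi in z) / (n - 1)
    se_alpha = math.sqrt(resid_var / (n * mu ** 2))
    return var / mu, alpha_hat, alpha_hat / se_alpha

# Made-up counts with many zeros and a long right tail.
counts = [0, 0, 0, 1, 0, 2, 0, 9, 0, 1, 14, 0, 3, 0, 0, 6]
ratio, alpha_hat, t_stat = cameron_trivedi_intercept_only(counts)
print(f"variance/mean = {ratio:.2f}, alpha_hat = {alpha_hat:.2f}, t = {t_stat:.2f}")
```

A ratio well above 1 together with a large positive t statistic points toward overdispersion, the case the negative binomial notebook addresses. With regressors, the fitted means from the Poisson model would replace the sample mean.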
Key Concepts Covered
- Poisson regression: Exponential mean function, equidispersion assumption
- FE Poisson: Conditional MLE for panel count data
- RE Poisson: Random effects with Gauss-Hermite quadrature
- QML Poisson: Robust to distributional misspecification
- PPML: Poisson pseudo-maximum likelihood for gravity models (Santos Silva & Tenreyro, 2006)
- Negative binomial: Accommodating overdispersion
- Zero-inflation: Modeling excess zeros with a two-component mixture (a zero-generating process plus a count process)
- Overdispersion tests: Cameron-Trivedi, Dean's score test
- Marginal effects: Incidence rate ratios and semi-elasticities
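To connect the first and last items in this list, the sketch below fits a Poisson regression with an exponential mean by Newton-Raphson, using only the standard library, and reports the incidence rate ratio exp(b1). The data and the `fit_poisson` helper are made up for illustration; panelbox's own estimators are not used here.

```python
import math

def fit_poisson(x, y, tol=1e-10, max_iter=50):
    """Newton-Raphson for E[y | x] = exp(b0 + b1 * x). Returns (b0, b1)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from the intercept-only fit
    for _ in range(max_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector of the Poisson log-likelihood
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for yi, mi, xi in zip(y, mu, x))
        # Negative Hessian (information matrix), 2 x 2
        h00 = sum(mu)
        h01 = sum(mi * xi for mi, xi in zip(mu, x))
        h11 = sum(mi * xi * xi for mi, xi in zip(mu, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve (negative Hessian) * step = score
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + step0, b1 + step1
        if abs(step0) + abs(step1) < tol:
            break
    return b0, b1

# Made-up data: x is a binary indicator, y is a count.
x = [0, 0, 0, 1, 1, 1]
y = [1, 2, 3, 3, 5, 4]

b0, b1 = fit_poisson(x, y)
irr = math.exp(b1)  # incidence rate ratio for x
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, IRR = {irr:.4f}")
```

With a binary regressor the Poisson MLE reproduces the group means, so the IRR here is simply the ratio of mean counts (4 / 2). Because the mean is exponential, b1 is also the semi-elasticity: a one-unit change in x multiplies the expected count by exp(b1).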
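Quick Example

    from panelbox.models.count import PoissonFE, PPML

    # FE Poisson: `data` is a long-format panel with one row per firm-year
    poisson = PoissonFE(
        data=data,
        formula="patents ~ rd_spending + firm_size",
        entity_col="firm",
        time_col="year",
    ).fit()
    print(poisson.summary())

    # PPML for gravity: `trade_data` has one row per country pair and year.
    # If pair fixed effects are absorbed, time-invariant regressors such as
    # log_distance are not separately identified.
    ppml = PPML(
        data=trade_data,
        formula="trade_flow ~ log_gdp_i + log_gdp_j + log_distance",
        entity_col="pair",
        time_col="year",
    ).fit()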
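The example above leaves `data` undefined. One way to try it out is to simulate a small panel, sketched below with the standard library only. This assumes the estimator expects long format with one row per firm-year and the column names used in the formula, which is an assumption about panelbox rather than something stated here. The coefficients, sample sizes, and the `poisson_draw` helper are hypothetical.

```python
import math
import random

def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

rng = random.Random(0)  # fixed seed so the simulated panel is reproducible
rows = []
for firm in range(1, 6):                  # 5 hypothetical firms
    firm_effect = rng.gauss(0.0, 0.5)     # unobserved, time-invariant heterogeneity
    for year in range(2015, 2021):        # 6 years per firm
        rd_spending = rng.uniform(0.0, 2.0)
        firm_size = rng.uniform(1.0, 3.0)  # time-varying, so FE does not absorb it
        lam = math.exp(0.2 + 0.6 * rd_spending + 0.3 * firm_size + firm_effect)
        rows.append({
            "firm": firm,
            "year": year,
            "rd_spending": rd_spending,
            "firm_size": firm_size,
            "patents": poisson_draw(rng, lam),
        })

# If pandas is installed, `data = pandas.DataFrame(rows)` gives a DataFrame
# with the columns the formula refers to.
print(len(rows), "rows; first row:", rows[0])
```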
Solutions
| Tutorial | Solution |
|---|---|
| 01. Poisson Introduction | Solution |
| 02. Negative Binomial | Solution |
| 03. FE/RE Count | Solution |
| 04. PPML Gravity | Solution |
| 05. Zero-Inflated | Solution |
| 06. Marginal Effects | Solution |
| 07. Case Study | Solution |
Related Documentation
- PPML Gravity Tutorial -- Self-contained gravity model notebook
- Marginal Effects Tutorials -- Average marginal effects (AME) for nonlinear models
- User Guide -- API reference