Count Data Tutorials

Learning Path

Prerequisites: Static Models tutorials, basic MLE concepts
Time: 3 to 6 hours
Level: Beginner to Advanced

Overview

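Count data models apply when the dependent variable is a non-negative integer, such as patent counts, doctor visits, or accident frequencies. Standard linear regression is inappropriate because it can predict negative values and ignores the discrete, non-negative nature of the outcome. Poisson-type estimators also extend to non-negative outcomes that are not integers, such as trade flows, as long as the conditional mean is correctly specified.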
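
The point about negative predictions can be checked on simulated data. The sketch below is illustrative only: it uses plain NumPy rather than panelbox, and the data-generating process, seed, and variable names are assumptions made for the example. It fits a straight line and a Poisson regression (by Newton-Raphson) to the same counts and compares the smallest fitted values.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Simulated counts with an exponential mean: E[y | x] = exp(0.5 + 1.0 * x)
n = 500
x = np.linspace(-2.0, 2.0, n)
X = np.column_stack([np.ones(n), x])  # intercept plus one regressor
y = rng.poisson(np.exp(0.5 + 1.0 * x)).astype(float)

# Linear regression by least squares
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
ols_fitted = X @ beta_ols

# Poisson regression by Newton-Raphson on the log-likelihood
beta = np.array([np.log(y.mean()), 0.0])  # start at the intercept-only fit
for _ in range(50):
    mu = np.exp(X @ beta)
    score = X.T @ (y - mu)                # gradient of the log-likelihood
    info = X.T @ (X * mu[:, None])        # negative Hessian
    step = np.linalg.solve(info, score)
    beta = beta + step
    if np.max(np.abs(step)) < 1e-10:
        break
poisson_fitted = np.exp(X @ beta)

print("smallest OLS fitted value:    ", ols_fitted.min())
print("smallest Poisson fitted value:", poisson_fitted.min())
print("Poisson coefficients:         ", beta)
```

Because the Poisson fitted values are exponentials of a linear index, they are positive by construction, while the least-squares line dips below zero at the low end of `x`.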

These tutorials cover the Poisson model (pooled, fixed effects, random effects), quasi-maximum likelihood (QML) Poisson, the Poisson pseudo-maximum likelihood (PPML) estimator for gravity models in trade, negative binomial models for overdispersion, and zero-inflated models for excess zeros.

The PPML Gravity notebook provides a self-contained introduction to gravity models using PPML.
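
To illustrate why PPML is attractive for gravity equations, the following sketch compares it with the log-linear OLS alternative on simulated flows. Everything here is an assumption made for illustration: the data-generating process, the coefficient values, and the hand-rolled Newton-Raphson solver stand in for the library's estimator and are not taken from panelbox. The flows are non-negative and continuous, contain exact zeros, and have a zero probability that depends on the regressors.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical gravity data: E[flow | x] = exp(b0 + b1*log_gdp + b2*log_dist)
n = 5000
log_gdp = rng.uniform(-1.0, 1.0, n)
log_dist = rng.uniform(-1.0, 1.0, n)
X = np.column_stack([np.ones(n), log_gdp, log_dist])
true_beta = np.array([0.0, 1.0, -1.0])
mu_true = np.exp(X @ true_beta)

# Multiplicative error with mean 1 and a point mass at zero.
# Pairs with a small expected flow are more likely to trade nothing.
p_positive = mu_true / (1.0 + mu_true)
is_positive = rng.random(n) < p_positive
shock = rng.gamma(shape=2.0, scale=1.0 / (2.0 * p_positive))  # mean 1/p
flow = mu_true * is_positive * shock  # E[flow | x] = mu_true

# Log-linear OLS: zeros must be dropped because log(0) is undefined
keep = flow > 0
beta_ols, *_ = np.linalg.lstsq(X[keep], np.log(flow[keep]), rcond=None)
n_dropped = int(n - keep.sum())

# PPML: Poisson first-order conditions applied to flows in levels
beta_ppml = np.array([np.log(flow.mean()), 0.0, 0.0])
for _ in range(100):
    mu = np.exp(X @ beta_ppml)
    step = np.linalg.solve(X.T @ (X * mu[:, None]), X.T @ (flow - mu))
    beta_ppml = beta_ppml + step
    if np.max(np.abs(step)) < 1e-10:
        break

print("observations dropped by log-OLS:", n_dropped)
print("log-OLS coefficients:", beta_ols)
print("PPML coefficients:   ", beta_ppml)
```

In this setup PPML uses every observation and should recover slopes close to the assumed values of 1 and -1, while the log-linear regression discards the zero flows and is attenuated because the error term is not independent of the regressors.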

Notebooks

#   Tutorial                     Level          Time     Colab
1   Poisson Introduction         Beginner       45 min   Open In Colab
2   Negative Binomial            Intermediate   45 min   Open In Colab
3   FE/RE Count Models           Intermediate   45 min   Open In Colab
4   PPML Gravity Models          Intermediate   60 min   Open In Colab
5   Zero-Inflated Models         Advanced       45 min   Open In Colab
6   Marginal Effects for Count   Advanced       45 min   Open In Colab
7   Innovation Case Study        Advanced       60 min   Open In Colab

Learning Paths

Core (3 hours)

Essential count data methods:

Notebooks: 1, 2, 3, 4

Covers Poisson, negative binomial, FE/RE count models, and PPML for gravity equations.

Complete (6 hours)

Full count data analysis coverage:

Notebooks: 1 to 7

Adds zero-inflated models, marginal effects, overdispersion diagnostics, and a complete case study.

Key Concepts Covered

  • Poisson regression: Exponential mean function, equidispersion assumption
  • FE Poisson: Conditional MLE for panel count data
  • RE Poisson: Random effects with Gauss-Hermite quadrature
  • QML Poisson: Consistent under distributional misspecification when the conditional mean is correctly specified; paired with robust standard errors
  • PPML: Poisson pseudo-maximum likelihood for gravity models (Santos Silva & Tenreyro, 2006)
  • Negative binomial: Accommodating overdispersion
  • Zero-inflation: Modeling excess zeros with a two-component mixture (a binary process for structural zeros plus a count process)
  • Overdispersion tests: Cameron-Trivedi, Dean's score test
  • Marginal effects: Incidence rate ratios and semi-elasticities
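
Two of the concepts above, the overdispersion test and the incidence rate ratio, can be sketched in a few lines. This is a minimal NumPy illustration under stated assumptions, not the panelbox implementation: the helper names `fit_poisson` and `cameron_trivedi` are hypothetical, the data are simulated, and the test shown is the regression-based Cameron-Trivedi check against the alternative Var(y) = mu + alpha * mu^2.

```python
import numpy as np

def fit_poisson(X, y, n_iter=50, tol=1e-10):
    """Poisson MLE via Newton-Raphson; column 0 of X is assumed to be the intercept."""
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        step = np.linalg.solve(X.T @ (X * mu[:, None]), X.T @ (y - mu))
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def cameron_trivedi(y, mu):
    """Auxiliary regression of z = ((y - mu)^2 - y) / mu on mu, without intercept.

    Returns (alpha_hat, t_stat). A large positive t_stat is evidence against
    equidispersion in favour of Var(y) = mu + alpha * mu^2.
    """
    z = ((y - mu) ** 2 - y) / mu
    alpha = (mu @ z) / (mu @ mu)
    resid = z - alpha * mu
    se = np.sqrt((resid @ resid) / (len(y) - 1) / (mu @ mu))
    return alpha, alpha / se

rng = np.random.default_rng(2)
n = 2000
x = rng.uniform(-1.0, 1.0, n)
X = np.column_stack([np.ones(n), x])
mu_true = np.exp(1.0 + 0.5 * x)

# Equidispersed counts versus gamma-Poisson (negative binomial) counts, alpha = 1
y_pois = rng.poisson(mu_true).astype(float)
y_nb = rng.poisson(rng.gamma(shape=1.0, scale=mu_true)).astype(float)

beta_pois = fit_poisson(X, y_pois)
alpha_pois, t_pois = cameron_trivedi(y_pois, np.exp(X @ beta_pois))

beta_nb = fit_poisson(X, y_nb)
alpha_nb, t_nb = cameron_trivedi(y_nb, np.exp(X @ beta_nb))

# Incidence rate ratio: multiplicative change in the expected count per unit of x
irr = np.exp(beta_pois[1])

print("Poisson data: alpha_hat, t =", alpha_pois, t_pois)
print("NB data:      alpha_hat, t =", alpha_nb, t_nb)
print("IRR for x (Poisson data):   ", irr)
```

On the negative binomial sample the estimated alpha should be clearly positive with a large t-statistic, while on the Poisson sample it should sit near zero. The incidence rate ratio is the exponentiated coefficient, so a value near exp(0.5) means a one-unit increase in `x` multiplies the expected count by roughly 1.65.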

Quick Example

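from panelbox.models.count import PoissonFE, PPML

# `data` and `trade_data` are assumed to be long-format pandas DataFrames
# (one row per entity-period) that have been loaded beforehand.

# FE Poisson: patent counts on R&D spending and firm size, with firm fixed effects
poisson = PoissonFE(
    data=data,
    formula="patents ~ rd_spending + firm_size",
    entity_col="firm",
    time_col="year"
).fit()

print(poisson.summary())

# PPML for gravity: trade flows in levels on log GDPs and log distance
ppml = PPML(
    data=trade_data,
    formula="trade_flow ~ log_gdp_i + log_gdp_j + log_distance",
    entity_col="pair",
    time_col="year"
).fit()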

Solutions

Tutorial Solution
01. Poisson Introduction Solution
02. Negative Binomial Solution
03. FE/RE Count Solution
04. PPML Gravity Solution
05. Zero-Inflated Solution
06. Marginal Effects Solution
07. Case Study Solution