Multinomial and Conditional Logit
Quick Reference
Classes: MultinomialLogit, ConditionalLogit
Import: from panelbox.models.discrete.multinomial import MultinomialLogit, ConditionalLogit
Stata equivalent: mlogit, clogit
R equivalent: nnet::multinom(), survival::clogit(), mlogit::mlogit()
Overview
When the dependent variable takes on multiple unordered categories -- such as transportation mode (car, bus, train), occupation (blue collar, white collar, professional), or brand choice -- standard binary or ordered models are inappropriate. PanelBox provides two models for this setting:
Multinomial Logit (MNL) models the probability of choosing alternative \(j\) based on individual-level characteristics \(X_i\):
\[P(y_i = j \mid X_i) = \frac{\exp(X_i'\beta_j)}{\sum_{m=0}^{J-1} \exp(X_i'\beta_m)}, \quad j = 0, 1, \ldots, J-1\]
with the normalization \(\beta_0 = 0\) for the base alternative. Each non-base alternative has its own coefficient vector \(\beta_j\), yielding a \((J-1) \times K\) parameter matrix.
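The MNL probabilities are a softmax over the linear indices \(X_i'\beta_j\), with the base alternative's index fixed at zero. The following is a minimal NumPy sketch of that calculation; the function name `mnl_probabilities` and the numbers are illustrative assumptions, not part of the PanelBox API.

```python
import numpy as np

def mnl_probabilities(X, beta_nonbase, base_alternative=0):
    """Softmax choice probabilities with the base coefficients fixed at zero.

    X: (N, K) regressors.
    beta_nonbase: (J-1, K) coefficients for the non-base alternatives.
    Returns an (N, J) array whose rows sum to one.
    """
    # Insert a row of zeros at the base alternative's position -> (J, K)
    beta_full = np.insert(beta_nonbase, base_alternative, 0.0, axis=0)
    utilities = X @ beta_full.T                          # (N, J) linear indices
    utilities = utilities - utilities.max(axis=1, keepdims=True)  # avoid overflow
    expu = np.exp(utilities)
    return expu / expu.sum(axis=1, keepdims=True)

# Illustrative values only (not estimates from any data set)
X = np.array([[1.0, 0.5], [1.0, -1.0]])
beta = np.array([[0.2, 1.0], [-0.3, 0.4]])   # J-1 = 2 rows, K = 2 columns
probs = mnl_probabilities(X, beta)
print(probs)
```

Subtracting the row maximum before exponentiating leaves the probabilities unchanged and only guards against numerical overflow.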
Conditional Logit (McFadden 1974) models choice based on alternative-level attributes that vary across options:
\[P(y_i = j \mid Z_i) = \frac{\exp(Z_{ij}'\gamma)}{\sum_{m=0}^{J-1} \exp(Z_{im}'\gamma)}\]
where \(Z_{ij}\) contains attributes of alternative \(j\) (price, quality, distance). A single coefficient vector \(\gamma\) applies to all alternatives.
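For a single choice occasion, the conditional logit probabilities depend only on the attribute rows \(Z_{ij}\) and the shared vector \(\gamma\). A small sketch under assumed, made-up values (the helper `clogit_probabilities` is hypothetical and not a PanelBox function):

```python
import numpy as np

def clogit_probabilities(Z, gamma):
    """Choice probabilities for one choice occasion.

    Z: (J, M) attributes, one row per alternative.
    gamma: (M,) coefficients shared by all alternatives.
    """
    v = Z @ gamma          # (J,) systematic utilities
    v = v - v.max()        # numerical stability; probabilities are unchanged
    expv = np.exp(v)
    return expv / expv.sum()

# Rows: car, bus, train; columns: travel_time, cost (illustrative numbers)
Z = np.array([[30.0, 10.0], [45.0, 3.0], [20.0, 8.0]])
gamma = np.array([-0.05, -0.10])   # assumed disutility of time and cost
p = clogit_probabilities(Z, gamma)
print(p)
```

Because \(\gamma\) is common to all alternatives, any attribute that is identical across alternatives cancels out of the ratio and cannot be identified without interacting it with alternative indicators.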
Quick Example
# Multinomial Logit
import numpy as np
from panelbox.models.discrete.multinomial import MultinomialLogit
# X: (N, K) array of regressors, assumed to be loaded already
# y: occupational choice (0=blue collar, 1=white collar, 2=professional)
model = MultinomialLogit(
endog=y, exog=X,
n_alternatives=3, base_alternative=0
)
result = model.fit(method="BFGS")
# Coefficient matrix: (J-1, K)
print(result.params_matrix)
# Predicted probabilities: (N, J)
probs = result.predict_proba()
print(result.summary())
# Conditional Logit
import pandas as pd
from panelbox.models.discrete.multinomial import ConditionalLogit
# df: long-format DataFrame, one row per (choice occasion, alternative),
# assumed to be loaded already
model = ConditionalLogit(
data=df,
choice_col="choice_id",
alt_col="alternative",
chosen_col="chosen",
alt_varying_vars=["price", "quality"],
case_varying_vars=["income"]
)
result = model.fit()
print(result.summary())
When to Use
- Multinomial Logit: individual-level characteristics drive the choice (education, income, demographics determine occupation)
- Conditional Logit: alternative-level attributes drive the choice (price, quality, travel time determine transportation mode)
- Both models assume IIA -- the Independence of Irrelevant Alternatives (see below)
Key Assumptions
- IIA (Independence of Irrelevant Alternatives): The relative odds between any two alternatives are independent of other alternatives. Violated when alternatives are close substitutes (e.g., the red bus/blue bus problem).
- No unobserved heterogeneity in the basic pooled specification. Use method="fixed_effects" or method="random_effects" in MultinomialLogit for panel heterogeneity.
- Conditional Logit: homogeneous preferences across individuals (same \(\gamma\) for all).
Detailed Guide
Multinomial Logit
Estimation
The MNL model estimates \(J-1\) coefficient vectors (one per non-base alternative):
from panelbox.models.discrete.multinomial import MultinomialLogit
model = MultinomialLogit(
endog=y,
exog=X,
n_alternatives=3, # J = 3 categories
base_alternative=0, # Base category (normalized to zero)
method="pooled" # "pooled", "fixed_effects", or "random_effects"
)
result = model.fit(method="BFGS", maxiter=1000)
Panel Data Extensions
MultinomialLogit supports three estimation methods for panel data:
# Pooled MNL (ignores panel structure)
model = MultinomialLogit(endog=y, exog=X, n_alternatives=3, method="pooled")
# Fixed Effects MNL (Chamberlain-style conditional MLE)
model = MultinomialLogit(
endog=y, exog=X, n_alternatives=3,
method="fixed_effects",
entity_col="id", time_col="year"
)
# Random Effects MNL (Gauss-Hermite quadrature)
model = MultinomialLogit(
endog=y, exog=X, n_alternatives=3,
method="random_effects",
entity_col="id", time_col="year"
)
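To illustrate what Gauss-Hermite quadrature does in a random effects setting, the sketch below approximates an expectation over a normally distributed effect \(\alpha \sim N(0, \sigma^2)\). It is a generic illustration: the function name, the number of nodes, and the example integrand are assumptions, and nothing here describes how PanelBox implements the estimator internally.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def gauss_hermite_expectation(f, sigma, n_nodes=12):
    """Approximate E[f(alpha)] for alpha ~ N(0, sigma^2)."""
    nodes, weights = hermgauss(n_nodes)          # rule for weight exp(-x^2)
    # Change of variables: alpha = sqrt(2) * sigma * x
    alpha = np.sqrt(2.0) * sigma * nodes
    return float(np.sum(weights * f(alpha)) / np.sqrt(np.pi))

# Sanity check: the second moment of N(0, sigma^2) is sigma^2
second_moment = gauss_hermite_expectation(lambda a: a ** 2, sigma=1.5)

# Choice probability averaged over a random intercept (illustrative index 0.3)
p_marginal = gauss_hermite_expectation(
    lambda a: 1.0 / (1.0 + np.exp(-(0.3 + a))), sigma=1.0
)
print(second_moment, p_marginal)
```

In a random effects likelihood the integrand would be the product of an entity's choice probabilities over time, evaluated at each quadrature node.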
Computational Considerations
Fixed effects estimation with many alternatives (\(J > 4\)) or long panels (\(T > 10\)) can be computationally intensive. For large problems, consider the pooled specification with cluster-robust standard errors, or the random effects approach.
Interpreting Coefficients
Coefficients are expressed on the log-odds scale, relative to the base alternative:
\[\log \frac{P(y_i = j \mid X_i)}{P(y_i = 0 \mid X_i)} = X_i'\beta_j\]
A coefficient \(\beta_{jk}\) means: a one-unit increase in \(x_k\) changes the log-odds of choosing alternative \(j\) (vs. the base) by \(\beta_{jk}\), holding the other regressors fixed.
# Coefficient matrix: (J-1) x K
print("Parameters for alternative 1 vs base:")
print(result.params_matrix[0, :])
print("Parameters for alternative 2 vs base:")
print(result.params_matrix[1, :])
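Exponentiating a coefficient turns the log-odds change into a multiplicative change in the odds relative to the base, and differencing two rows gives the contrast between two non-base alternatives. A short sketch using a made-up coefficient matrix (the array below stands in for `result.params_matrix` and is not real output):

```python
import numpy as np

# Hypothetical (J-1, K) coefficient matrix; row 0 = alternative 1 vs base,
# row 1 = alternative 2 vs base
params_matrix = np.array([[0.5, -0.2],
                          [1.2,  0.1]])

# Odds ratios relative to the base alternative
odds_ratios = np.exp(params_matrix)

# Percentage change in the odds for a one-unit increase in each regressor
pct_change = 100.0 * (odds_ratios - 1.0)

# Log-odds of alternative 2 vs alternative 1: difference of the two rows
contrast_2_vs_1 = params_matrix[1] - params_matrix[0]

print(odds_ratios)
print(pct_change)
print(contrast_2_vs_1)
```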
Marginal Effects Over Coefficients
Coefficients are hard to interpret directly because each probability depends on all \(\beta_j\) vectors simultaneously; the sign of \(\beta_{jk}\) need not even match the sign of the effect of \(x_k\) on \(P(y = j)\). Compute marginal effects for substantive interpretation.
Predictions and Classification
# Predicted probabilities for each alternative: (N, J)
probs = result.predict_proba()
# Most likely alternative for each observation
predicted = result.predict()
# Classification quality
print(f"Accuracy: {result.accuracy:.3f}")
print(f"Confusion matrix:\n{result.confusion_matrix}")
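The classification summaries can be reproduced by hand from a probability matrix. The sketch below assumes the predicted class is the alternative with the highest probability and that the confusion matrix has observed classes in rows and predicted classes in columns; the orientation used by PanelBox is not stated here, so treat that as an assumption.

```python
import numpy as np

def classify(probs, y_true):
    """Return predicted classes, accuracy, and a (J, J) confusion matrix."""
    y_true = np.asarray(y_true, dtype=int)
    n_alt = probs.shape[1]
    predicted = probs.argmax(axis=1)              # most likely alternative
    accuracy = float((predicted == y_true).mean())
    confusion = np.zeros((n_alt, n_alt), dtype=int)
    # Rows: observed class, columns: predicted class (assumed orientation)
    np.add.at(confusion, (y_true, predicted), 1)
    return predicted, accuracy, confusion

# Illustrative probabilities and outcomes
probs = np.array([[0.60, 0.30, 0.10],
                  [0.20, 0.50, 0.30],
                  [0.10, 0.20, 0.70],
                  [0.40, 0.35, 0.25]])
y = [0, 1, 2, 1]
predicted, accuracy, confusion = classify(probs, y)
print(predicted, accuracy)
print(confusion)
```

Accuracy can look high when one alternative dominates the sample, so it is worth reading alongside the confusion matrix.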
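Marginal Effects
Marginal effects in MNL have a special structure. For alternative \(j\) and variable \(k\):
\[\frac{\partial P_{ij}}{\partial x_{ik}} = P_{ij}\left(\beta_{jk} - \sum_{m=0}^{J-1} P_{im}\,\beta_{mk}\right)\]
where \(P_{ij} = P(y_i = j \mid X_i)\). The marginal effects sum to zero across alternatives because the probabilities must sum to 1.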
# Marginal effects at the mean: (J, K) matrix
me = result.marginal_effects(at="mean")
# Marginal effects at the median
me_median = result.marginal_effects(at="median")
# Average marginal effects
me_overall = result.marginal_effects(at="overall")
# For a specific variable
me_var = result.marginal_effects(at="mean", variable=0)
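As a cross-check on the structure described above, the marginal effect of \(x_k\) on \(P_j\) equals \(P_j\) times the gap between \(\beta_{jk}\) and the probability-weighted average of the \(\beta_{mk}\). The sketch below evaluates this at one point; `mnl_marginal_effects` is a hypothetical helper with made-up coefficients, not the library's implementation.

```python
import numpy as np

def mnl_marginal_effects(x, beta_full):
    """Marginal effects at the point x.

    x: (K,) evaluation point (e.g., sample means).
    beta_full: (J, K) coefficients including a zero row for the base.
    Returns a (J, K) matrix of dP_j / dx_k.
    """
    v = beta_full @ x
    p = np.exp(v - v.max())
    p = p / p.sum()                          # (J,) probabilities at x
    beta_bar = p @ beta_full                 # (K,) probability-weighted mean
    return p[:, None] * (beta_full - beta_bar)

beta_full = np.array([[0.0, 0.0],            # base alternative
                      [0.2, 1.0],
                      [-0.3, 0.4]])
x_mean = np.array([1.0, 0.5])
me = mnl_marginal_effects(x_mean, beta_full)
print(me)
print(me.sum(axis=0))   # numerically zero in every column
```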
Conditional Logit (McFadden 1974)
The Conditional Logit is designed for choice data where alternatives have distinct, observable attributes.
Data Format
Conditional Logit requires long-format data: one row per choice occasion per alternative.
import pandas as pd
# Example: transportation mode choice
# Each row: one (traveler-trip, mode) combination
# Columns: trip ID, mode, whether chosen, mode attributes, traveler attributes
df = pd.DataFrame({
"trip_id": [1, 1, 1, 2, 2, 2, 3, 3, 3],
"mode": ["car", "bus", "train", "car", "bus", "train", "car", "bus", "train"],
"chosen": [1, 0, 0, 0, 1, 0, 0, 0, 1],
"travel_time": [30, 45, 20, 15, 60, 25, 50, 40, 15],
"cost": [10, 3, 8, 5, 2, 7, 12, 4, 9],
"income": [50, 50, 50, 30, 30, 30, 80, 80, 80],
})
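Before estimation it can help to verify the long-format layout. The following sketch checks three properties that the format implies: the chosen indicator is binary, each choice occasion has exactly one chosen alternative, and no (occasion, alternative) pair is repeated. The function `check_long_format` is a hypothetical helper; whether ConditionalLogit performs similar checks itself is not stated here.

```python
import pandas as pd

def check_long_format(df, choice_col, alt_col, chosen_col):
    """Return a list of layout problems; an empty list means none were found."""
    problems = []
    if not df[chosen_col].isin([0, 1]).all():
        problems.append("chosen indicator must be 0 or 1")
    chosen_per_case = df.groupby(choice_col)[chosen_col].sum()
    if not (chosen_per_case == 1).all():
        problems.append("each choice occasion needs exactly one chosen alternative")
    if df.duplicated(subset=[choice_col, alt_col]).any():
        problems.append("duplicate (choice occasion, alternative) rows")
    return problems

# Small illustrative data set in the same layout as above
trips = pd.DataFrame({
    "trip_id": [1, 1, 1, 2, 2, 2],
    "mode": ["car", "bus", "train", "car", "bus", "train"],
    "chosen": [1, 0, 0, 0, 1, 0],
})
print(check_long_format(trips, "trip_id", "mode", "chosen"))
```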
Estimation
from panelbox.models.discrete.multinomial import ConditionalLogit
model = ConditionalLogit(
data=df,
choice_col="trip_id", # Choice occasion identifier
alt_col="mode", # Alternative identifier
chosen_col="chosen", # Binary: 1 if chosen, 0 otherwise
alt_varying_vars=["travel_time", "cost"], # Attributes that vary by alternative
case_varying_vars=["income"], # Attributes that vary by individual (optional)
)
result = model.fit(method="BFGS")
print(result.summary())
Variable Types
| Variable Type | Description | Example | Coefficient |
|---|---|---|---|
| Alternative-varying | Different value for each alternative | Travel time, price | Single \(\gamma\) (generic) |
| Case-varying | Same value across alternatives | Income, age | \((J-1)\) coefficients (alternative-specific) |
For case-varying variables, the model estimates alternative-specific coefficients (relative to the base), similar to MNL. For alternative-varying variables, a single coefficient applies across all alternatives.
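One standard way to obtain alternative-specific coefficients for a case-varying variable is to interact it with indicators for the non-base alternatives. The sketch below builds such a design matrix for a single choice occasion; it illustrates the textbook construction under assumed names and is not a description of PanelBox internals.

```python
import numpy as np

def build_design(alt_attrs, case_attrs, base=0):
    """Design matrix for one choice occasion.

    alt_attrs: (J, M) alternative-varying attributes.
    case_attrs: (C,) case-varying attributes (same for all alternatives).
    Returns a (J, M + (J-1)*C) matrix.
    """
    n_alt = alt_attrs.shape[0]
    # Indicator columns for every alternative except the base -> (J, J-1)
    dummies = np.delete(np.eye(n_alt), base, axis=1)
    # Interact each indicator with the case-varying attributes -> (J, (J-1)*C)
    interactions = np.kron(dummies, case_attrs[None, :])
    return np.hstack([alt_attrs, interactions])

# Rows: car (base), bus, train; columns: travel_time, cost
alt_attrs = np.array([[30.0, 10.0], [45.0, 3.0], [20.0, 8.0]])
income = np.array([50.0])
design = build_design(alt_attrs, income)
print(design)
```

The base alternative's row has zeros in all interaction columns, which is what pins its case-varying coefficients to zero.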
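The IIA Assumption
The Independence of Irrelevant Alternatives states that the ratio of choice probabilities between any two alternatives is independent of other alternatives:
\[\frac{P(y_i = j \mid X_i)}{P(y_i = m \mid X_i)} = \frac{\exp(X_i'\beta_j)}{\exp(X_i'\beta_m)} = \exp\left(X_i'(\beta_j - \beta_m)\right)\]
The right-hand side involves only alternatives \(j\) and \(m\), so adding or removing any other alternative leaves the ratio unchanged.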
The Red Bus / Blue Bus Problem: If a city has car and red bus as transport options (50/50 split), adding a blue bus (identical to red) should split the bus share, yielding 50% car, 25% red bus, 25% blue bus. But IIA predicts 33/33/33 because it treats blue bus as equally distinct from car as red bus is.
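The arithmetic behind the red bus / blue bus example can be reproduced with equal utilities for all alternatives. This is a toy calculation with assumed utilities of zero, intended only to show how the logit formula forces the car-to-red-bus ratio to stay fixed.

```python
import math

def logit_shares(utilities):
    """Logit choice shares from a dict of systematic utilities."""
    expu = {name: math.exp(v) for name, v in utilities.items()}
    total = sum(expu.values())
    return {name: e / total for name, e in expu.items()}

before = logit_shares({"car": 0.0, "red_bus": 0.0})
after = logit_shares({"car": 0.0, "red_bus": 0.0, "blue_bus": 0.0})

# The car / red bus odds are identical before and after the blue bus appears
ratio_before = before["car"] / before["red_bus"]
ratio_after = after["car"] / after["red_bus"]

print(before)
print(after)
print(ratio_before, ratio_after)
```

The car share falls from one half to one third even though the blue bus should only draw riders from the red bus.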
When IIA Fails
If your alternatives include close substitutes, consider:
- Nested Logit: groups similar alternatives
- Mixed Logit: allows random taste variation
- Hausman-McFadden test: formal test of IIA (drop one alternative and check if remaining estimates change)
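The Hausman-McFadden statistic compares estimates from the full choice set with estimates obtained after dropping one alternative. A minimal sketch of the statistic follows, assuming both coefficient vectors have already been restricted to the parameters identified in both models; the function name and all numbers are made up for illustration, and no such helper is claimed to exist in PanelBox.

```python
import numpy as np

def hausman_mcfadden(b_restricted, V_restricted, b_full, V_full):
    """Hausman-McFadden statistic and its degrees of freedom.

    b_restricted, V_restricted: estimates and covariance with one
        alternative dropped.
    b_full, V_full: matching sub-vector and covariance block from the
        full model.
    """
    diff = b_restricted - b_full
    V_diff = V_restricted - V_full
    # Pseudo-inverse, since V_diff need not be positive definite in finite samples
    stat = float(diff @ np.linalg.pinv(V_diff) @ diff)
    dof = int(np.linalg.matrix_rank(V_diff))
    return stat, dof

# Made-up inputs, for illustration only
b_r = np.array([0.52, -0.31])
V_r = np.diag([0.010, 0.008])
b_f = np.array([0.50, -0.30])
V_f = np.diag([0.006, 0.005])
stat, dof = hausman_mcfadden(b_r, V_r, b_f, V_f)
print(stat, dof)
```

Under IIA the statistic is asymptotically chi-squared with `dof` degrees of freedom; a large value suggests the estimates shift when an alternative is removed.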
Configuration Options
MultinomialLogit
| Parameter | Type | Default | Description |
|---|---|---|---|
| endog | ndarray | required | Categorical outcome (0 to \(J-1\)) |
| exog | ndarray | required | Regressors |
| n_alternatives | int | None | Number of alternatives (inferred if None) |
| base_alternative | int | 0 | Reference alternative |
| method | str | "pooled" | "pooled", "fixed_effects", "random_effects" |
| entity_col | str | None | Entity identifier (required for FE/RE) |
| time_col | str | None | Time identifier |
fit() parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| start_params | ndarray | None | Starting values |
| method | str | "BFGS" | Optimization method |
| maxiter | int | 1000 | Maximum iterations |
ConditionalLogit
| Parameter | Type | Default | Description |
|---|---|---|---|
| data | DataFrame | required | Long-format choice data |
| choice_col | str | required | Choice occasion identifier |
| alt_col | str | required | Alternative identifier |
| chosen_col | str | required | Binary chosen indicator |
| alt_varying_vars | list | required | Alternative-varying attribute names |
| case_varying_vars | list | None | Case-varying attribute names |
fit() parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| start_params | ndarray | None | Starting values |
| method | str | "BFGS" | Optimization method |
| maxiter | int | 1000 | Maximum iterations |
Result Attributes
MultinomialLogitResult
| Attribute | Type | Description |
|---|---|---|
| params | ndarray | Flat parameter vector (length \((J-1) \times K\)) |
| params_matrix | ndarray | Parameters reshaped to \((J-1, K)\) |
| bse | ndarray | Standard errors |
| bse_matrix | ndarray | SEs reshaped to \((J-1, K)\) |
| cov_params | ndarray | Variance-covariance matrix |
| predicted_probs | ndarray | Predicted probabilities \((N, J)\) |
| llf | float | Log-likelihood |
| aic | float | Akaike Information Criterion |
| bic | float | Bayesian Information Criterion |
| pseudo_r2 | float | McFadden pseudo \(R^2\) |
| accuracy | float | Classification accuracy |
| confusion_matrix | ndarray | Confusion matrix \((J, J)\) |
| converged | bool | Convergence flag |
| iterations | int | Number of iterations |
ConditionalLogitResult
| Attribute | Type | Description |
|---|---|---|
| params | ndarray | Estimated coefficients |
| bse | ndarray | Standard errors |
| vcov | ndarray | Variance-covariance matrix |
| llf | float | Log-likelihood |
| aic | float | Akaike Information Criterion |
| bic | float | Bayesian Information Criterion |
| pseudo_r2 | float | McFadden pseudo \(R^2\) |
| accuracy | float | Classification accuracy |
| converged | bool | Convergence flag |
Diagnostics
Model Fit
# Multinomial Logit
print(f"Log-likelihood: {result.llf:.3f}")
print(f"McFadden R²: {result.pseudo_r2:.3f}")
print(f"AIC: {result.aic:.3f}")
print(f"BIC: {result.bic:.3f}")
print(f"Accuracy: {result.accuracy:.3f}")
print(f"Confusion matrix:\n{result.confusion_matrix}")
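For reference, the information criteria and McFadden's pseudo \(R^2\) follow from the log-likelihood with the usual textbook formulas. The sketch below assumes the null model is intercept-only (so its log-likelihood depends only on the sample shares) and that BIC uses the number of observations; how PanelBox defines these quantities in detail is an assumption here, and all inputs are made up.

```python
import math

def null_loglik(counts):
    """Log-likelihood of an intercept-only model from category counts."""
    n = sum(counts)
    return sum(c * math.log(c / n) for c in counts if c > 0)

def fit_statistics(llf, ll_null, n_params, n_obs):
    """AIC, BIC and McFadden pseudo R-squared from log-likelihoods."""
    return {
        "aic": -2.0 * llf + 2.0 * n_params,
        "bic": -2.0 * llf + n_params * math.log(n_obs),
        "pseudo_r2": 1.0 - llf / ll_null,
    }

# Hypothetical sample: 100 observations over three categories
counts = [50, 30, 20]
ll_null = null_loglik(counts)
stats = fit_statistics(llf=-90.0, ll_null=ll_null, n_params=6, n_obs=sum(counts))
print(ll_null)
print(stats)
```

Lower AIC or BIC indicates a better trade-off between fit and complexity, and BIC penalizes additional parameters more heavily once the sample has more than about eight observations.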
Comparing Specifications
# Compare pooled vs. RE
model_pooled = MultinomialLogit(endog=y, exog=X, n_alternatives=3, method="pooled")
res_pooled = model_pooled.fit()
model_re = MultinomialLogit(
endog=y, exog=X, n_alternatives=3,
method="random_effects", entity_col="id"
)
res_re = model_re.fit()
print(f"Pooled BIC: {res_pooled.bic:.1f}")
print(f"RE BIC: {res_re.bic:.1f}")
Tutorials
| Tutorial | Description | Link |
|---|---|---|
| Multinomial Choice | Occupational choice and brand selection examples |
See Also
- Binary Choice Models -- Logit and Probit for binary outcomes
- Ordered Choice Models -- Models for ordinal outcomes with natural ordering
- Dynamic Binary Panel -- State dependence in binary choice
- Marginal Effects -- Essential for MNL coefficient interpretation
References
- McFadden, D. (1974). "Conditional Logit Analysis of Qualitative Choice Behavior." In Frontiers in Econometrics, ed. P. Zarembka. Academic Press.
- McFadden, D. (1981). "Econometric Models of Probabilistic Choice." In Structural Analysis of Discrete Data, ed. C. Manski and D. McFadden. MIT Press.
- Train, K. (2009). Discrete Choice Methods with Simulation. 2nd ed. Cambridge University Press.
- Hausman, J. and McFadden, D. (1984). "Specification Tests for the Multinomial Logit Model." Econometrica, 52(5), 1219-1240.
- Cameron, A. C. and Trivedi, P. K. (2005). Microeconometrics: Methods and Applications. Cambridge University Press. Chapter 15.