AutoExperiment API Reference¶

Module

Import: from panelbox.autoexperiment import AutoExperiment, AutoExperimentResults, VariableTransformer, VariableSelector, ModelRanker, AutoCovTypeSelector, SpecificationFilter

Source: panelbox/autoexperiment/

Overview¶

AutoExperiment automates the full panel data modeling pipeline: variable transformation, forward stepwise selection, multi-model estimation, econometric validation, and composite ranking. It is designed for researchers who want a systematic, reproducible approach to model selection while respecting econometric best practices.

Class	Description
`AutoExperiment`	Main orchestrator — runs the full pipeline
`AutoExperimentResults`	Container for results, ranking, and report generation
`VariableTransformer`	Generates lags, diffs, logs, growth rates, and more
`VariableSelector`	Forward stepwise variable selection by BIC/AIC
`ModelRanker`	Composite scoring and ranking of candidate models
`AutoCovTypeSelector`	Automatic standard error selection based on diagnostics
`SpecificationFilter`	Classifies specifications as VALID / WARNING / INVALID

AutoExperiment¶

The main class that orchestrates the entire pipeline.

from panelbox.autoexperiment import AutoExperiment

# df: long-format panel DataFrame with one row per (firm, year)
auto = AutoExperiment(
    data=df,
    depvar="invest",
    entity_col="firm",
    time_col="year",
    candidates=["value", "capital"],
    transformations={"lag": [1, 2], "diff": True, "log": True},
    models=["pooled_ols", "fe", "re", "fd"],
    criterion="bic",
    max_vars=8,
    sign_constraints={"value": "+", "capital": "+"},
)
results = auto.run()

Constructor Parameters¶

Parameter	Type	Default	Description
`data`	`pd.DataFrame`	required	Panel data in long format
`depvar`	`str`	required	Dependent variable name
`entity_col`	`str`	required	Cross-sectional identifier column
`time_col`	`str`	required	Time identifier column
`candidates`	`list[str]`	`None`	Candidate regressors. If `None`, uses all numeric columns except `depvar`, `entity_col`, `time_col`
`transformations`	`dict`	`{'lag': [1], 'diff': True}`	Transformation specification (see VariableTransformer)
`models`	`list[str]`	`['pooled_ols', 'fe', 're', 'fd']`	Model types to estimate
`criterion`	`str`	`'bic'`	Information criterion: `'bic'` or `'aic'`
`max_vars`	`int`	`8`	Maximum regressors per model
`min_obs_per_var`	`int`	`10`	Minimum observations per variable (caps `max_vars` if needed)
`prefilter_corr`	`float`	`0.05`	Minimum absolute correlation with `depvar` to keep a candidate
`require_tests`	`bool`	`True`	If `True`, only VALID/WARNING models enter the ranking
`alpha`	`float`	`0.05`	Significance level for all diagnostic tests
`required_tests`	`list[str]`	`None`	Override the default per-model test list in `SpecificationFilter`
`sign_constraints`	`dict[str, str]`	`None`	`{variable: '+' or '-'}` — economic theory sign constraints
`max_combinations`	`int`	`500`	Budget cap for total model evaluations
`n_jobs`	`int`	`1`	Reserved for future parallel support
`verbose`	`int`	`1`	`0` = silent, `1` = progress bars, `2` = detailed logging
`random_state`	`int`	`None`	Random seed (reserved for future use)
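
The table says that `min_obs_per_var` caps `max_vars` when the sample is small, but it does not give the exact rule. The sketch below shows one plausible reading, in which the number of regressors is limited to the integer ratio of observations to `min_obs_per_var`. The helper name `effective_max_vars` and the formula itself are assumptions for illustration, not the library's implementation.

```python
def effective_max_vars(n_obs: int, max_vars: int = 8, min_obs_per_var: int = 10) -> int:
    """Cap max_vars so each regressor is backed by at least min_obs_per_var observations."""
    # Assumed rule: allow at most n_obs // min_obs_per_var regressors, never fewer than one.
    cap = max(1, n_obs // min_obs_per_var)
    return min(max_vars, cap)


print(effective_max_vars(n_obs=200))  # large sample: max_vars is not binding
print(effective_max_vars(n_obs=45))   # small sample: the cap lowers max_vars
```

Under this reading, a panel with 45 usable observations would allow at most 4 regressors even though `max_vars` is 8.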

Valid Model Types¶

Alias	Model
`'pooled_ols'`	Pooled OLS
`'fe'`	Fixed Effects (within estimator)
`'re'`	Random Effects (GLS)
`'fd'`	First Difference

`run()`¶

auto.run() -> AutoExperimentResults

Executes the full six-phase pipeline:

1. Variable transformations: generates lags, diffs, logs, etc.
2. Forward stepwise selection: per model type, using BIC/AIC
3. Estimation & validation: fits each model, runs diagnostic tests
4. Hausman test: if both FE and RE are present, marks the model the test does not favor as WARNING
5. Ranking: composite score combining criterion + tests + signs + parsimony
6. Results assembly: packages everything into AutoExperimentResults

Returns: AutoExperimentResults

AutoExperimentResults¶

Container for all outputs produced by AutoExperiment.run().

Attributes¶

Attribute	Type	Description
`best_model`	`PanelResults` or `None`	Fitted result object for the best model
`best_formula`	`str`	Formula of the best model
`best_estimator`	`str`	Canonical model type (`'fe'`, `'re'`, etc.)
`best_cov_type`	`str`	Standard error type auto-selected for the best model
`ranking`	`pd.DataFrame`	Model ranking table sorted by composite score
`all_results`	`dict`	`{model_name: {'results', 'metrics', 'classification', ...}}`
`transformations_used`	`dict`	`{transformed_var: (source_var, transformation_type)}`
`variable_selection`	`dict`	`{model_type: {'selected_vars', 'rejected_vars', 'bic_path'}}`
`test_results`	`dict`	`{model_name: {'classification', 'passed', 'failed'}}`
`n_combinations_tested`	`int`	Total number of model/variable combinations evaluated
`datamining_warning`	`bool`	`True` if > 100 combinations were tested

Methods¶

`summary()`¶

results.summary() -> str

Returns a human-readable text summary with the best model, formula, BIC, R-squared, classification status, and data mining warning if applicable.

`compare_top(n=5)`¶

results.compare_top(n=5) -> pd.DataFrame

Returns the top n models with columns: model_type, formula, bic, aic, rsq, rsq_adj, classification, n_vars, cov_type, score.

`plot_bic_comparison()`¶

results.plot_bic_comparison() -> matplotlib.figure.Figure

Horizontal bar chart of BIC by model, color-coded by classification (green=VALID, yellow=WARNING, red=INVALID).

`plot_variable_importance()`¶

results.plot_variable_importance() -> matplotlib.figure.Figure

Bar chart showing how many times each variable was selected across model types.

`report(filepath=None)`¶

results.report(filepath=None) -> str

Generates a comprehensive HTML report with:

Summary box (best model, formula, cov type)
Data mining warning (if applicable)
Model ranking table
Variable selection details per model
Diagnostic test results

Parameter	Type	Default	Description
`filepath`	`str`	`None`	If provided, saves the HTML to this path

Returns: HTML content as string.

VariableTransformer¶

Generates transformed variables respecting the panel structure (within-entity operations).

from panelbox.autoexperiment import VariableTransformer

transformer = VariableTransformer(
    data=df,
    entity_col="firm",
    time_col="year",
    transformations={"lag": [1, 2], "diff": True, "log": True, "growth": True, "sq": True},
    nan_threshold=0.30,
)
transformed_data = transformer.transform(["value", "capital"])

Constructor Parameters¶

Parameter	Type	Default	Description
`data`	`pd.DataFrame`	required	Panel data in long format
`entity_col`	`str`	required	Entity identifier column
`time_col`	`str`	required	Time identifier column
`transformations`	`dict`	`{'lag': [1], 'diff': True}`	Transformation specification (see below)
`nan_threshold`	`float`	`0.30`	Maximum fraction of NaN allowed per transformed variable (discarded if exceeded)

Supported Transformations¶

Key	Value	Naming	Example
`'lag'`	`list[int]`	`L{k}_var`	`L1_value`, `L2_value`
`'diff'`	`bool`	`D_var`	`D_value`
`'log'`	`bool`	`log_var`	`log_value`
`'acum'`	`list[int]`	`acum{k}_var`	`acum3_value`, `acum6_value`
`'growth'`	`bool`	`growth_var`	`growth_value`
`'sq'`	`bool`	`sq_var`	`sq_value`

Data Quality

Variables with x <= 0 are skipped for log transformation.
Highly correlated transformed variables (> 0.95) are automatically removed.
Variables exceeding nan_threshold are discarded.
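
To make the naming scheme and the within-entity behavior concrete, here is a minimal pandas sketch of a few of the transformations above. It is not the `VariableTransformer` source: the function name `add_panel_transforms` is hypothetical, growth is assumed to be the period-over-period percentage change, and the correlation filter (> 0.95) is left out for brevity.

```python
import numpy as np
import pandas as pd


def add_panel_transforms(df, var, entity_col, time_col, lags=(1,), nan_threshold=0.30):
    """Add lag, diff, growth, square, and log columns computed within each entity."""
    out = df.sort_values([entity_col, time_col]).copy()
    grouped = out.groupby(entity_col)[var]

    for k in lags:
        out[f"L{k}_{var}"] = grouped.shift(k)           # lag never crosses entity boundaries
    out[f"D_{var}"] = grouped.diff()                    # first difference within entity
    out[f"growth_{var}"] = out[var] / grouped.shift(1) - 1  # assumed growth definition
    out[f"sq_{var}"] = out[var] ** 2
    if (out[var] > 0).all():                            # skip log when any value is <= 0
        out[f"log_{var}"] = np.log(out[var])

    # Discard new columns whose share of NaN exceeds nan_threshold.
    new_cols = [c for c in out.columns if c not in df.columns]
    too_sparse = [c for c in new_cols if out[c].isna().mean() > nan_threshold]
    return out.drop(columns=too_sparse)


toy = pd.DataFrame({
    "firm": ["A"] * 4 + ["B"] * 4,
    "year": [1, 2, 3, 4] * 2,
    "value": [10, 12, 15, 18, 100, 90, 95, 99],
})
panel = add_panel_transforms(toy, "value", "firm", "year", lags=(1, 2))
print(panel.columns.tolist())
```

In this toy panel each firm has only four periods, so the second lag is NaN in half of the rows and is discarded by the `nan_threshold` check, while the first lag (25% NaN) is kept.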

Methods¶

Method	Returns	Description
`transform(variables)`	`pd.DataFrame`	Generates all transformations, returns data with new columns
`get_transformation_map()`	`dict`	`{transformed_var: (source_var, type)}`
`get_valid_transformations()`	`list[str]`	Names of non-discarded transformed columns
`summary()`	`str`	Summary of generated, valid, and discarded transformations

VariableSelector¶

Forward stepwise variable selection using BIC or AIC, with optional sign constraints.

from panelbox.autoexperiment import VariableSelector

selector = VariableSelector(
    depvar="invest",
    candidates=["value", "capital", "L1_value", "D_capital"],
    entity_col="firm",
    time_col="year",
    criterion="bic",
    max_vars=5,
    sign_constraints={"value": "+", "capital": "+"},
)
# df must already contain the transformed candidates (e.g. L1_value, D_capital)
result = selector.forward_stepwise(data=df, model_class="fe")

Constructor Parameters¶

Parameter	Type	Default	Description
`depvar`	`str`	required	Dependent variable name
`candidates`	`list[str]`	required	Candidate regressor names
`entity_col`	`str`	required	Entity identifier column
`time_col`	`str`	required	Time identifier column
`criterion`	`str`	`'bic'`	`'bic'` or `'aic'`
`max_vars`	`int`	`8`	Maximum variables to select
`min_obs_per_var`	`int`	`10`	Minimum observations per variable
`prefilter_corr`	`float`	`0.05`	Minimum absolute correlation with depvar for pre-filtering
`sign_constraints`	`dict[str, str]`	`None`	`{variable: '+' or '-'}`
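
The `prefilter_corr` parameter drops weak candidates before the stepwise search starts. A rough sketch of such a filter follows, assuming a plain pooled Pearson correlation over all rows; the helper names `pearson` and `prefilter` are hypothetical and the library may compute the correlation differently.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def prefilter(columns, depvar, candidates, min_abs_corr=0.05):
    """Keep candidates whose absolute correlation with depvar reaches the threshold."""
    y = columns[depvar]
    return [c for c in candidates if abs(pearson(columns[c], y)) >= min_abs_corr]


columns = {
    "y": [1, 2, 3, 4, 5, 6],
    "x_good": [2, 4, 5, 8, 10, 13],   # moves with y
    "x_noise": [1, 0, 2, 2, 0, 1],    # constructed to be uncorrelated with y
}
kept = prefilter(columns, "y", ["x_good", "x_noise"])
print(kept)
```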

Methods¶

`forward_stepwise(data, model_class)`¶

Runs the forward stepwise algorithm. At each step, the candidate that most improves the criterion is added, subject to sign constraints.

Returns: dict with keys:

Key	Type	Description
`selected_vars`	`list[str]`	Variables selected in order
`bic_path`	`list[float]`	Criterion value at each step
`rejected_vars`	`dict`	`{var: reason}` — why each rejected variable was excluded
`n_steps`	`int`	Number of steps taken

Sign Constraints on Transformed Variables

Sign constraints defined on source variables are automatically inherited by their transformations. For example, {'value': '+'} also constrains L1_value, log_value, and sq_value. Differenced/growth transforms (D_value, growth_value) may invert the expected sign.
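
The forward stepwise procedure described in this section can be sketched generically as follows. This is an illustration of the technique, not the `VariableSelector` code: `criterion_fn` and `sign_ok` are hypothetical callables standing in for model estimation and sign checking, and the stopping rule (stop when no candidate lowers the criterion) is an assumption.

```python
from typing import Callable, Dict, List, Optional


def forward_stepwise(
    candidates: List[str],
    criterion_fn: Callable[[List[str]], float],
    sign_ok: Optional[Callable[[List[str]], bool]] = None,
    max_vars: int = 8,
) -> Dict[str, object]:
    """Greedy forward selection that minimizes an information criterion."""
    selected: List[str] = []
    path: List[float] = []
    remaining = list(candidates)
    best = criterion_fn([])  # baseline: model without any candidate regressor

    while remaining and len(selected) < max_vars:
        trials = []
        for var in remaining:
            trial = selected + [var]
            if sign_ok is not None and not sign_ok(trial):
                continue  # skip specifications that violate a sign constraint
            trials.append((criterion_fn(trial), var))
        if not trials:
            break
        score, var = min(trials)
        if score >= best:
            break  # no candidate improves the criterion any further
        selected.append(var)
        remaining.remove(var)
        path.append(score)
        best = score

    return {
        "selected_vars": selected,
        "bic_path": path,
        "rejected_vars": {v: "not selected" for v in remaining},
        "n_steps": len(selected),
    }


def toy_bic(variables):
    # Toy criterion: each variable lowers the fit term, each costs a penalty of 5.
    gains = {"value": 30, "capital": 20, "L1_value": 1}
    return 100 - sum(gains[v] for v in variables) + 5 * len(variables)


result = forward_stepwise(["value", "capital", "L1_value"], toy_bic)
print(result["selected_vars"], result["bic_path"])
```

With the toy criterion, `value` and `capital` are added in that order, and `L1_value` is left out because its small gain does not offset the penalty.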

ModelRanker¶

Ranks models by a weighted composite score combining four components.

from panelbox.autoexperiment import ModelRanker

ranker = ModelRanker(weights={
    "criterion": 0.50,
    "tests": 0.30,
    "signs": 0.15,
    "parsimony": 0.05,
})
ranking_df = ranker.rank(model_evaluations)

Composite Score Formula¶

The score for each model is:

\[ \text{score} = w_c \cdot S_{\text{criterion}} + w_t \cdot S_{\text{tests}} + w_s \cdot S_{\text{signs}} + w_p \cdot S_{\text{parsimony}} \]

Component	Weight (default)	Computation
Criterion	50%	Min-max normalized BIC (lower BIC = higher score)
Tests	30%	Fraction of diagnostic tests passed
Signs	15%	Fraction of sign constraints satisfied
Parsimony	5%	`1 - (n_vars / max_vars)`

INVALID models are always ranked last, regardless of score.
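
As a worked illustration of the formula and table above, the following sketch computes the composite score for three made-up models. The input layout (dicts with `bic`, `tests_passed`, `signs_ok` as fractions, `n_vars`, `classification`) and the function name `composite_scores` are assumptions chosen for the example; the numbers are invented.

```python
def composite_scores(models, weights=None, max_vars=8):
    """Score models with the weighted composite and place INVALID ones last."""
    weights = weights or {"criterion": 0.50, "tests": 0.30, "signs": 0.15, "parsimony": 0.05}
    total = sum(weights.values())
    w = {k: v / total for k, v in weights.items()}  # normalize weights to sum to 1

    bics = [m["bic"] for m in models]
    lo, hi = min(bics), max(bics)

    scored = []
    for m in models:
        # Min-max normalization, inverted so that the lowest BIC scores 1.
        s_crit = 1.0 if hi == lo else (hi - m["bic"]) / (hi - lo)
        s_pars = 1 - m["n_vars"] / max_vars
        score = (
            w["criterion"] * s_crit
            + w["tests"] * m["tests_passed"]
            + w["signs"] * m["signs_ok"]
            + w["parsimony"] * s_pars
        )
        scored.append({**m, "score": score})

    # Sort key: valid/warning models first (by descending score), INVALID last.
    return sorted(scored, key=lambda m: (m["classification"] == "INVALID", -m["score"]))


models = [
    {"name": "pooled_ols", "bic": 1480, "tests_passed": 0.5, "signs_ok": 1.0, "n_vars": 4, "classification": "INVALID"},
    {"name": "fe", "bic": 1500, "tests_passed": 0.75, "signs_ok": 1.0, "n_vars": 2, "classification": "VALID"},
    {"name": "re", "bic": 1520, "tests_passed": 1.0, "signs_ok": 1.0, "n_vars": 2, "classification": "WARNING"},
]
ranked = composite_scores(models)
for m in ranked:
    print(m["name"], round(m["score"], 4))
```

Here the pooled OLS model has the lowest BIC and the highest raw score, yet it is listed last because it is classified INVALID.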

Constructor Parameters¶

Parameter	Type	Default	Description
`weights`	`dict[str, float]`	`{'criterion': 0.50, 'tests': 0.30, 'signs': 0.15, 'parsimony': 0.05}`	Weights are normalized to sum to 1.0

`rank(model_evaluations)`¶

ranker.rank(model_evaluations: list[dict]) -> pd.DataFrame

Returns: DataFrame with columns: rank, model_type, formula, score, bic, aic, rsq_adj, classification, n_vars, cov_type, tests_passed, signs_ok.

AutoCovTypeSelector¶

Selects the most appropriate standard error type based on diagnostic test results.

Decision Hierarchy¶

Priority	Condition	Selected `cov_type`
1	Pesaran CD rejects (cross-sectional dependence)	`'driscoll_kraay'`
2	Wooldridge/Breusch-Godfrey rejects (serial correlation)	`'newey_west'`
3	Modified Wald/Breusch-Pagan/White rejects (heteroskedasticity)	`'robust'`
4	Nothing rejects, N <= 30	`'clustered'`
4	Nothing rejects, N > 30	`'nonrobust'`
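
The hierarchy in the table can be read as a simple cascade of checks. The sketch below encodes it with boolean flags; the function name `select_cov_type` and its flag arguments are hypothetical and stand in for whatever the real `validation_report` object exposes.

```python
def select_cov_type(cross_dep: bool, serial_corr: bool, heterosk: bool, n_entities: int) -> str:
    """Return the cov_type implied by the decision hierarchy, highest priority first."""
    if cross_dep:       # Pesaran CD rejects
        return "driscoll_kraay"
    if serial_corr:     # Wooldridge / Breusch-Godfrey rejects
        return "newey_west"
    if heterosk:        # Modified Wald / Breusch-Pagan / White rejects
        return "robust"
    # Nothing rejects: the choice depends only on the number of entities.
    return "clustered" if n_entities <= 30 else "nonrobust"


print(select_cov_type(cross_dep=True, serial_corr=True, heterosk=False, n_entities=50))
print(select_cov_type(cross_dep=False, serial_corr=False, heterosk=False, n_entities=10))
```

Because the checks run in priority order, a model that shows both cross-sectional dependence and serial correlation is assigned `'driscoll_kraay'`.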

Methods¶

Method	Returns	Description
`select(validation_report, n_entities)`	`str`	Returns the recommended `cov_type`
`explain(validation_report, n_entities)`	`str`	Human-readable justification for the choice

SpecificationFilter¶

Classifies panel model specifications based on diagnostic test results.

Default Tests per Model Type¶

Model	Tests
`pooled_ols`	Breusch-Pagan, RESET
`fe`	Pesaran CD, Wooldridge, Modified Wald, RESET
`re`	Mundlak, Breusch-Pagan, RESET
`fd`	Breusch-Pagan, RESET

Critical Tests¶

Failing a critical test makes the specification INVALID:

RESET — functional form misspecification
Mundlak — correlated random effects (RE is inconsistent)

Classification Logic¶

Classification	Condition
VALID	All required tests pass
WARNING	Some non-critical tests fail
INVALID	At least one critical test fails
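
The classification rules can be expressed in a few lines. This sketch assumes the test outcomes arrive as a dict mapping test name to a pass/fail boolean, which is an illustrative simplification of the real validation report; the function name `classify_spec` is hypothetical.

```python
CRITICAL_TESTS = {"RESET", "Mundlak"}


def classify_spec(test_passed: dict) -> dict:
    """Classify a specification from a {test_name: passed} mapping."""
    passed = [t for t, ok in test_passed.items() if ok]
    failed = [t for t, ok in test_passed.items() if not ok]
    critical = [t for t in failed if t in CRITICAL_TESTS]

    if critical:
        label = "INVALID"   # at least one critical test fails
    elif failed:
        label = "WARNING"   # only non-critical tests fail
    else:
        label = "VALID"     # all required tests pass

    return {
        "classification": label,
        "passed_tests": passed,
        "failed_tests": failed,
        "critical_failures": critical,
    }


print(classify_spec({"Pesaran CD": False, "Wooldridge": True, "Modified Wald": True, "RESET": True}))
print(classify_spec({"Mundlak": False, "Breusch-Pagan": True, "RESET": True}))
```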

Methods¶

`classify(model_type, validation_report, alpha=None)`¶

filter.classify("fe", validation_report) -> dict

Returns: dict with classification, passed_tests, failed_tests, critical_failures, auto_cov_type.

`hausman_decision(data, formula, entity_col, time_col, alpha=None)`¶

filter.hausman_decision(df, "y ~ x1 + x2", "firm", "year") -> str

Runs the Hausman test. Returns 'fe' if the null is rejected (use FE), 're' otherwise.
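
For intuition about the decision rule, the following sketch works through the Hausman test in the simplest case of a single coefficient, where the statistic is the squared difference between the FE and RE estimates divided by the difference of their variances and is compared with a chi-squared distribution with one degree of freedom. The function name `hausman_decision_1d` and the input numbers are made up for illustration; the library method works from the data and formula instead.

```python
from math import erfc, sqrt


def hausman_decision_1d(b_fe, var_fe, b_re, var_re, alpha=0.05):
    """Single-coefficient Hausman test: returns (decision, statistic, p_value)."""
    diff_var = var_fe - var_re
    if diff_var <= 0:
        raise ValueError("var_fe must exceed var_re for the statistic to be defined")
    stat = (b_fe - b_re) ** 2 / diff_var
    # Survival function of chi-squared with 1 degree of freedom.
    p_value = erfc(sqrt(stat / 2))
    decision = "fe" if p_value < alpha else "re"  # reject the null -> use FE
    return decision, stat, p_value


far_apart = hausman_decision_1d(b_fe=0.31, var_fe=0.004, b_re=0.11, var_re=0.003)
close = hausman_decision_1d(b_fe=0.12, var_fe=0.004, b_re=0.11, var_re=0.003)
print(far_apart[0], close[0])
```

When the two estimates are far apart relative to their variance difference, the null is rejected and FE is chosen; when they nearly coincide, RE is retained.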

Complete Example¶

from panelbox.datasets import load_grunfeld
from panelbox.autoexperiment import AutoExperiment

data = load_grunfeld()

auto = AutoExperiment(
    data=data,
    depvar="invest",
    entity_col="firm",
    time_col="year",
    candidates=["value", "capital"],
    transformations={"lag": [1, 2], "diff": True, "log": True},
    models=["pooled_ols", "fe", "re", "fd"],
    criterion="bic",
    max_vars=6,
    sign_constraints={"value": "+", "capital": "+"},
    verbose=1,
)

results = auto.run()

# Summary
print(results.summary())

# Top 3 models
print(results.compare_top(3))

# Best model coefficients
print(results.best_model.summary())

# HTML report
results.report("autoexperiment_report.html")

# Plots
fig1 = results.plot_bic_comparison()
fig2 = results.plot_variable_importance()
