NFL 4th-Down Decision Model
Small multi-task MLP that recommends go / field goal / punt on 4th down by
predicting the change in win probability (ΔWP) for each action.
Trained on 2016-2024 NFL play-by-play data with labels generated by
`nfl4th::add_4th_probs()`.
Model description
- Inputs (23 features): quarter, game/half seconds remaining, yardline,
  yards-to-go, score differential, timeouts, weather, point spread, and one-hot
  roof/surface indicators, plus derived helpers (e.g., `fg_dist_yd`,
  `in_fg_range`). See `feature_order.json` for the exact schema.
- Outputs: ΔWP for go/field-goal/punt (main head) and auxiliary heads predicting FG make and 4th-down conversion probabilities.
- Recommendation: choose the action with highest ΔWP.
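The recommendation rule can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's own code: the `recommend` helper and the `ACTIONS` tuple are hypothetical names, and the action order is assumed to match the `[go, FG, punt]` order of the ΔWP head.

```python
# Assumed order of the ΔWP head's three outputs.
ACTIONS = ("go", "field_goal", "punt")


def recommend(dwp):
    """Return (action, dwp_value) for the action with the highest predicted ΔWP."""
    if len(dwp) != len(ACTIONS):
        raise ValueError(f"expected {len(ACTIONS)} values, got {len(dwp)}")
    # Index of the largest ΔWP; ties resolve to the earliest action in ACTIONS.
    best = max(range(len(ACTIONS)), key=lambda i: dwp[i])
    return ACTIONS[best], dwp[best]


# Made-up ΔWP values, for illustration only.
action, value = recommend([0.012, -0.004, 0.003])
print(action, value)
```

Because the outputs are changes in win probability, the best available action can still have a negative ΔWP; the rule simply picks the least harmful option in that case.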
Training
- Split: 2016-2021 train, 2022 validation, 2023-2024 test.
- Objective: Huber loss for ΔWP + 0.2·MSE(aux).
- Optimizer/regularization: AdamW, BatchNorm, Dropout, L2.
- Artifacts: best weights saved as `mlp_dwp_plus_aux_best.keras`.
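The split and objective listed above can be sketched in plain Python to make the arithmetic explicit. This is an illustration under stated assumptions, not the training code: the function names are hypothetical, the Huber threshold `delta=1.0` is an assumed value (the model card does not state it), and the auxiliary targets are treated as one flat list because the card does not say how the two auxiliary heads are combined.

```python
def assign_split(season):
    """Map a season to its split, following the years listed in the model card."""
    if 2016 <= season <= 2021:
        return "train"
    if season == 2022:
        return "validation"
    if season in (2023, 2024):
        return "test"
    raise ValueError(f"season {season} is outside 2016-2024")


def huber(y_true, y_pred, delta=1.0):
    """Mean Huber loss: quadratic for small errors, linear beyond delta (assumed 1.0)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        err = abs(t - p)
        if err <= delta:
            total += 0.5 * err * err
        else:
            total += delta * (err - 0.5 * delta)
    return total / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def combined_loss(dwp_true, dwp_pred, aux_true, aux_pred, aux_weight=0.2, delta=1.0):
    """Huber loss on ΔWP plus aux_weight times MSE on the auxiliary targets."""
    return huber(dwp_true, dwp_pred, delta) + aux_weight * mse(aux_true, aux_pred)
```

Since ΔWP values are small in magnitude, an assumed `delta` of 1.0 would keep nearly every error in the quadratic region; a smaller threshold would make the loss more robust to outlier labels.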
Evaluation (2023–2024 test set)
| Metric | Value |
|---|---|
| Policy agreement | 0.794 |
| Mean regret (ΔWP) | 0.0023 |
| Late & close (≤600 s, abs(score) ≤ 8) agreement | 0.744 |
| Late & close mean regret | 0.0099 |
(ΔWP reported in win-probability points; lower regret is better.)
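As a rough guide to how metrics of this kind can be computed, the sketch below assumes that agreement means the model's top action matches the top action under the label ΔWP values, and that regret is the label ΔWP of the best action minus the label ΔWP of the model's chosen action. Both definitions are assumptions, since the model card does not spell them out, and `evaluate_policy` is a hypothetical helper name.

```python
def argmax(values):
    """Index of the largest value; ties resolve to the earliest index."""
    return max(range(len(values)), key=lambda i: values[i])


def evaluate_policy(label_dwp, pred_dwp):
    """Return (policy_agreement, mean_regret) over a list of plays.

    label_dwp, pred_dwp: one [go, field_goal, punt] ΔWP triple per play.
    """
    agree = 0
    regret = 0.0
    for labels, preds in zip(label_dwp, pred_dwp):
        best = argmax(labels)    # best action according to the labels
        chosen = argmax(preds)   # action the model would recommend
        agree += int(best == chosen)
        regret += labels[best] - labels[chosen]  # zero when the two agree
    n = len(label_dwp)
    return agree / n, regret / n


# Two made-up plays: the model agrees on the first and misses on the second.
labels = [[0.02, 0.00, -0.01], [-0.03, 0.01, 0.00]]
preds = [[0.01, 0.00, 0.00], [0.02, 0.00, 0.01]]
agreement, mean_regret = evaluate_policy(labels, preds)
print(agreement, mean_regret)
```

Under these definitions regret is never negative, and a model can have imperfect agreement yet low regret when its misses happen on plays where the actions are nearly equivalent.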
Usage
Load from this repo (Hugging Face Hub)
from huggingface_hub import hf_hub_download
import tensorflow as tf
import numpy as np
repo_id = "YOUR_USERNAME/YOUR_REPO" # change to your model repo
weights = "mlp_dwp_plus_aux_best.keras"
path = hf_hub_download(repo_id=repo_id, filename=weights)
model = tf.keras.models.load_model(path)
# Example input: dict mapping each feature name to np.array([value]).
# Scenario: 4th-and-4 in the 2nd quarter, 60 yards from the opponent's end zone.
x = {
    "quarter": np.array([2.0]),
    # 900 s left in the first half plus the full 1800 s second half
    "game_seconds_remaining": np.array([2700.0]),
    "half_seconds_remaining": np.array([900.0]),
    "yardline_100": np.array([60.0]),
    "ydstogo": np.array([4.0]),
    "score_differential": np.array([3.0]),
    "receive_2h_ko": np.array([1.0]),
    "timeouts_off": np.array([3.0]),
    "timeouts_def": np.array([3.0]),
    "timeouts_total": np.array([6.0]),
    "timeouts_diff": np.array([0.0]),
    "temp_f": np.array([70.0]),
    "wind_mph": np.array([5.0]),
    "spread_line": np.array([0.0]),
    "fg_dist_yd": np.array([77.0]),
    "in_fg_range": np.array([0.0]),
    "one_score": np.array([1.0]),
    "late_game": np.array([0.0]),
    "second_half": np.array([0.0]),
    "roof": np.array(["outdoors"]),
    "surface": np.array(["grass"]),
}
pred = model.predict(x, verbose=0)["dwp"][0] # ΔWP for [go, FG, punt]
actions = ["Go for it", "Field goal", "Punt"]
print(dict(zip(actions, pred)))
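The derived helper features in the input dict could be computed from the raw fields along the following lines. Every rule here is an assumption rather than the project's documented preprocessing: `fg_dist_yd = yardline_100 + 17` is inferred from the example input (60 maps to 77), the 8-point and 600-second thresholds are borrowed from the "late & close" evaluation slice, and the 60-yard field-goal range cutoff is a purely hypothetical placeholder. The helper name `add_derived_features` is also hypothetical.

```python
def add_derived_features(raw, fg_range_max_yd=60.0, late_game_seconds=600.0,
                         one_score_margin=8.0):
    """Return a copy of `raw` (plain scalars) with derived helper features added.

    All thresholds are illustrative assumptions, not values from the model card.
    """
    f = dict(raw)
    f["timeouts_total"] = raw["timeouts_off"] + raw["timeouts_def"]
    f["timeouts_diff"] = raw["timeouts_off"] - raw["timeouts_def"]
    # Kick distance: yards to the goal line + 10-yard end zone + about 7 yards for the snap.
    f["fg_dist_yd"] = raw["yardline_100"] + 17.0
    f["in_fg_range"] = float(f["fg_dist_yd"] <= fg_range_max_yd)
    f["one_score"] = float(abs(raw["score_differential"]) <= one_score_margin)
    f["late_game"] = float(raw["game_seconds_remaining"] <= late_game_seconds)
    f["second_half"] = float(raw["quarter"] >= 3)
    return f


# Hypothetical scenario: 4th quarter, 300 s left, ball 35 yards out, trailing by 4.
raw = {
    "quarter": 4.0,
    "game_seconds_remaining": 300.0,
    "yardline_100": 35.0,
    "score_differential": -4.0,
    "timeouts_off": 2.0,
    "timeouts_def": 1.0,
}
features = add_derived_features(raw)
print(features["fg_dist_yd"], features["in_fg_range"], features["late_game"])
```

Before passing values to the model, each scalar would still need to be wrapped as `np.array([value])`, and the keys should be checked against `feature_order.json`, which remains the authoritative schema.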
Intended use & limitations
- Purpose: Educational decision assistant for 4th-down scenarios.
- Not for wagering or real-time coaching; relies on historical
  probabilities and inherits assumptions from nfl4th.
- Weather and wp_punt imputation introduce simplifications; apply domain
  judgment when using recommendations.
Citation
If you use this model, please cite the original project:
Villagran, M. (2025). NFL 4th-Down WinPct Recommender. MIT License.