NFL 4th-Down Decision Model

Small multi-task MLP that recommends go / field goal / punt on 4th down by predicting the change in win probability (ΔWP) for each action.
Built from 2016-2024 NFL play-by-play data (split by season as described under Training), with labels generated by nfl4th::add_4th_probs().

Model description

  • Inputs (23 features): quarter, game/half seconds remaining, yardline, yards-to-go, score differential, timeouts, weather, point spread, and one-hot roof/surface indicators, plus derived helpers (e.g., fg_dist_yd, in_fg_range). See feature_order.json for exact schema.
  • Outputs: ΔWP for go/field-goal/punt (main head) and auxiliary heads predicting FG make and 4th-down conversion probabilities.
  • Recommendation: choose the action with highest ΔWP.
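The derived helpers can be reproduced from the raw play state. The sketch below is a guess at how they might be computed, not the project's actual code. The 17-yard offset matches the usage example in this card (yardline_100 = 60 gives fg_dist_yd = 77). The field-goal range cutoff, the one-score margin, and the late-game threshold are assumed values, and the function name derive_helpers is hypothetical. feature_order.json remains the authoritative schema.

```python
def derive_helpers(yardline_100, score_differential, game_seconds_remaining,
                   quarter, timeouts_off, timeouts_def,
                   fg_range_max_yd=60.0, late_game_seconds=600.0):
    """Hypothetical reconstruction of the derived helper features.

    fg_range_max_yd and late_game_seconds are ASSUMED thresholds,
    as is the 8-point one-score margin used below.
    """
    # Kick distance: 10-yard end zone plus roughly 7 yards for snap and hold.
    fg_dist_yd = yardline_100 + 17.0
    return {
        "fg_dist_yd": fg_dist_yd,
        "in_fg_range": float(fg_dist_yd <= fg_range_max_yd),
        "one_score": float(abs(score_differential) <= 8),
        "late_game": float(game_seconds_remaining <= late_game_seconds),
        "second_half": float(quarter >= 3),
        "timeouts_total": timeouts_off + timeouts_def,
        "timeouts_diff": timeouts_off - timeouts_def,
    }
```

Computing these in one place keeps inference inputs consistent with whatever preprocessing was used at training time, provided the thresholds are replaced with the real ones.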

Training

  • Split: 2016-2021 train, 2022 validation, 2023-2024 test.
  • Objective: Huber loss for ΔWP + 0.2·MSE(aux).
  • Optimizer/regularization: AdamW, BatchNorm, Dropout, L2.
  • Artifacts: best weights saved as mlp_dwp_plus_aux_best.keras.
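The objective above can be written out in a few lines of NumPy. This is a minimal sketch for illustration, not the training code: the Huber threshold delta is an assumption (the card does not state it), the L2 penalty is left out, and the function names are hypothetical. Only the 0.2 auxiliary weight comes from the description above.

```python
import numpy as np

def huber(y_true, y_pred, delta=1.0):
    """Mean Huber loss; delta is an ASSUMED threshold."""
    err = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    abs_err = np.abs(err)
    quadratic = 0.5 * err ** 2                   # used where |err| <= delta
    linear = delta * (abs_err - 0.5 * delta)     # used where |err| > delta
    return float(np.mean(np.where(abs_err <= delta, quadratic, linear)))

def combined_loss(dwp_true, dwp_pred, aux_true, aux_pred,
                  aux_weight=0.2, delta=1.0):
    """Huber(ΔWP) + aux_weight * MSE(aux), with aux_weight = 0.2 as stated."""
    aux_err = np.asarray(aux_true, dtype=float) - np.asarray(aux_pred, dtype=float)
    mse_aux = float(np.mean(aux_err ** 2))
    return huber(dwp_true, dwp_pred, delta) + aux_weight * mse_aux
```

The auxiliary term acts as a light regularizer here: at a weight of 0.2 it shapes the shared layers without dominating the ΔWP head.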

Evaluation (2023–2024 test set)

Metric                                               Value
Policy agreement                                     0.794
Mean regret (ΔWP)                                    0.0023
Late & close (≤600 s, abs(score) ≤ 8) agreement      0.744
Late & close (≤600 s, abs(score) ≤ 8) mean regret    0.0099

(ΔWP reported in win-probability points; lower regret is better.)
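One plausible way to compute these two metrics is sketched below. The definitions are assumptions, since the card does not spell them out: agreement is taken as the share of plays where the model's top-ΔWP action matches the top action under the reference (nfl4th) ΔWP values, and regret as the reference ΔWP given up by following the model instead of the reference-best action. The function name policy_metrics is hypothetical.

```python
import numpy as np

def policy_metrics(dwp_pred, dwp_ref):
    """Return (policy agreement, mean regret) under the ASSUMED definitions.

    dwp_pred, dwp_ref: arrays of shape (n_plays, 3), columns [go, FG, punt].
    """
    dwp_pred = np.asarray(dwp_pred, dtype=float)
    dwp_ref = np.asarray(dwp_ref, dtype=float)
    chosen = dwp_pred.argmax(axis=1)   # action the model recommends
    best = dwp_ref.argmax(axis=1)      # action the reference labels prefer
    rows = np.arange(len(dwp_ref))
    # Reference ΔWP lost by taking the model's action instead of the best one.
    regret = dwp_ref[rows, best] - dwp_ref[rows, chosen]
    return float((chosen == best).mean()), float(regret.mean())
```

Under these definitions regret is never negative and is zero whenever the model agrees with the reference. The late-and-close figures would come from applying the same function to the filtered subset of plays.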

Usage

Load from this repo (Hugging Face Hub)

from huggingface_hub import hf_hub_download
import tensorflow as tf
import numpy as np

repo_id = "YOUR_USERNAME/YOUR_REPO"  # change to your model repo
weights = "mlp_dwp_plus_aux_best.keras"

path = hf_hub_download(repo_id=repo_id, filename=weights)
model = tf.keras.models.load_model(path)

# example input (dict of feature: np.array([value]))
x = {
    "quarter": np.array([2.0]),
    "game_seconds_remaining": np.array([2700.0]),  # 900 s left in the half + 1800 s of second half
    "half_seconds_remaining": np.array([900.0]),
    "yardline_100": np.array([60.0]),
    "ydstogo": np.array([4.0]),
    "score_differential": np.array([3.0]),
    "receive_2h_ko": np.array([1.0]),
    "timeouts_off": np.array([3.0]),
    "timeouts_def": np.array([3.0]),
    "timeouts_total": np.array([6.0]),
    "timeouts_diff": np.array([0.0]),
    "temp_f": np.array([70.0]),
    "wind_mph": np.array([5.0]),
    "spread_line": np.array([0.0]),
    "fg_dist_yd": np.array([77.0]),
    "in_fg_range": np.array([0.0]),
    "one_score": np.array([1.0]),
    "late_game": np.array([0.0]),
    "second_half": np.array([0.0]),
    "roof": np.array(["outdoors"]),
    "surface": np.array(["grass"]),
}

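# The model returns a dict of outputs; "dwp" holds ΔWP for [go, FG, punt]
pred = model.predict(x, verbose=0)["dwp"][0]
actions = ["Go for it", "Field goal", "Punt"]
print({a: float(v) for a, v in zip(actions, pred)})
print("Recommended:", actions[int(np.argmax(pred))])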
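To score several plays at once, the per-play values can be stacked into one dict of arrays and each output row mapped to an action. The helpers below are a hypothetical sketch (to_batch and recommend are not part of this repo), and they assume the model accepts batched dict inputs in the same format as the single-play example above.

```python
import numpy as np

ACTIONS = ["Go for it", "Field goal", "Punt"]

def to_batch(plays):
    """Stack a list of per-play feature dicts into one dict of arrays."""
    keys = plays[0].keys()
    return {k: np.array([p[k] for p in plays]) for k in keys}

def recommend(dwp):
    """Map each row of ΔWP values [go, FG, punt] to the highest-ΔWP action."""
    dwp = np.atleast_2d(np.asarray(dwp, dtype=float))
    return [ACTIONS[i] for i in dwp.argmax(axis=1)]
```

With a loaded model, something like recommend(model.predict(to_batch(plays), verbose=0)["dwp"]) would then return one recommendation per play, assuming every play dict carries the full feature set.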

Intended use & limitations

Purpose: Educational decision assistant for 4th‑down scenarios.

Not intended for wagering or real‑time coaching. The model relies on
historical probabilities and inherits the assumptions of nfl4th, whose
outputs serve as its training labels.

Weather and wp_punt imputation introduce simplifications; apply domain
judgment when using recommendations.

Citation

If you use this model, please cite the original project:

Villagran, M. (2025). NFL 4th-Down WinPct Recommender. MIT License.
