Entrepreneurial Readiness Classifier (XGBoost, v2)

3-class tabular classifier that predicts entrepreneurial readiness (low, medium, or high) from numeric features describing finances, capacity, experience, and mindset.

This is an XGBoost model (not a Transformer). It’s published here for easy access via the Hugging Face Hub.


Files in this repo

  • xgb_model.json: XGBoost booster (JSON format)
  • feature_order.json: ordered list of input feature names (the model expects this order)
  • label_map.json: mapping { "low": 0, "medium": 1, "high": 2 }

Input features (and meaning)

The model expects all of the following numeric features (see feature_order.json for the exact order):

  1. savings: liquid savings ($)
  2. monthly_income: monthly income ($)
  3. monthly_bills: monthly fixed expenses ($)
  4. monthly_entertainment_spend: discretionary monthly spend ($)
  5. sales_skills_1to10: self-rated sales skills (1–10)
  6. age: years
  7. dependents_count: number of dependents
  8. assets: approximate assets value ($)
  9. risk_tolerance_1to10: self-rated risk tolerance (1–10)
  10. confidence_1to10: self-rated confidence (1–10)
  11. idea_difficulty_1to10: perceived idea difficulty (1–10; higher = harder)
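  12. runway_months: months of financial runway, i.e. savings / net burn, where net burn = (monthly_bills + monthly_entertainment_spend) - monthly_income; clipped to the range 0 to 60, and set to 60 when net burn ≤ 0
  13. savings_to_expense_ratio: savings / (monthly_bills + monthly_entertainment_spend), capped at 12 (set to 12 when expenses are 0)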
  14. prior_businesses_started_: count of prior startups/ventures
  15. prior_exits: count of prior exits
  16. time_available_hours_per_week: weekly time available for the venture

⚠️ Important: The model was trained with runway_months and savings_to_expense_ratio already computed. Both are required inputs, so compute them at inference time with the same formulas and caps used in training.
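Because the model expects every feature listed above, it can help to check a row before building the input frame. The following is a minimal sketch, not part of the published repo: the helper name `check_row` and the `CAPS` table are hypothetical, the feature names are copied from the list above (in practice, load them from feature_order.json), and treating values outside the stated ranges as problems is an assumption about what you want to flag.

```python
# Feature names copied from the list above; in practice, load feature_order.json instead.
EXPECTED_FEATURES = [
    "savings", "monthly_income", "monthly_bills", "monthly_entertainment_spend",
    "sales_skills_1to10", "age", "dependents_count", "assets",
    "risk_tolerance_1to10", "confidence_1to10", "idea_difficulty_1to10",
    "runway_months", "savings_to_expense_ratio", "prior_businesses_started_",
    "prior_exits", "time_available_hours_per_week",
]

# Upper bounds stated in the feature descriptions for the two derived features.
CAPS = {"runway_months": 60.0, "savings_to_expense_ratio": 12.0}


def check_row(row):
    """Hypothetical helper: return a list of problems (empty if none were found)."""
    problems = []
    for name in EXPECTED_FEATURES:
        if name not in row:
            problems.append(f"missing feature: {name}")
            continue
        value = row[name]
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{name} is not numeric: {value!r}")
            continue
        # Self-rated scores are documented as 1-10.
        if name.endswith("_1to10") and not 1 <= value <= 10:
            problems.append(f"{name} is outside 1-10: {value}")
        # Derived features are documented as capped.
        if name in CAPS and value > CAPS[name]:
            problems.append(f"{name} exceeds the cap of {CAPS[name]}: {value}")
    return problems


# Illustrative row: every feature set to 5, then two deliberate mistakes.
example = {name: 5 for name in EXPECTED_FEATURES}
example["runway_months"] = 75   # above the documented cap of 60
del example["prior_exits"]      # simulate a forgotten field
for problem in check_row(example):
    print(problem)
```

A row that passes these checks can still be far from the training distribution; the sketch only catches missing fields and values outside the documented ranges.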


Quick start (Python)

Install

pip install xgboost pandas huggingface_hub

### Load from the Hub and predict

```python
import json

import pandas as pd
from huggingface_hub import hf_hub_download
from xgboost import XGBClassifier

REPO_ID = "mjpsm/Entrepreneurial-Readiness-XGB-v2"

# Download artifacts
model_file = hf_hub_download(REPO_ID, "xgb_model.json")
feat_file  = hf_hub_download(REPO_ID, "feature_order.json")
map_file   = hf_hub_download(REPO_ID, "label_map.json")

# Load model + metadata
clf = XGBClassifier()
clf.load_model(model_file)
with open(feat_file) as f:
    feature_order = json.load(f)
with open(map_file) as f:
    label_map = json.load(f)
inv_map = {v: k for k, v in label_map.items()}

# (Optional) helper if you don't provide the two derived features
def add_derived(r):
    r = dict(r)  # work on a copy so the caller's dict is not mutated
    bills = float(r["monthly_bills"])
    ent = float(r["monthly_entertainment_spend"])
    income = float(r["monthly_income"])
    savings = float(r["savings"])
    # Ratio (cap at 12, matching training); zero expenses map to the cap
    expenses = bills + ent
    r["savings_to_expense_ratio"] = min(12.0, savings / expenses) if expenses > 0 else 12.0
    # Runway (cap at 60, matching training); zero or negative net burn maps to the cap
    net_burn = expenses - income
    r["runway_months"] = 60.0 if net_burn <= 0 else max(0.0, min(60.0, savings / net_burn))
    return r

# Example row (numbers are illustrative)
row = {
  "savings": 8500.0,
  "monthly_income": 4200.0,
  "monthly_bills": 3100.0,
  "monthly_entertainment_spend": 200.0,
  "sales_skills_1to10": 7,
  "age": 29,
  "dependents_count": 0,
  "assets": 15000.0,
  "risk_tolerance_1to10": 7,
  "confidence_1to10": 8,
  "idea_difficulty_1to10": 5,
  "prior_businesses_started_": 1,
  "prior_exits": 0,
  "time_available_hours_per_week": 35
}

# If you didn't compute the derived features already, add them:
row = add_derived(row)

# Predict
X = pd.DataFrame([row])[feature_order]
pred_id = int(clf.predict(X)[0])
probs = clf.predict_proba(X)[0].tolist()

print("prediction:", inv_map[pred_id])
print("probs:", {inv_map[i]: round(p, 4) for i, p in enumerate(probs)})
```

Evaluation results

  • macro_f1 on entrepreneurial_readiness_v2 (balanced): 0.954 (self-reported)
  • log_loss on entrepreneurial_readiness_v2 (balanced): 0.122 (self-reported)
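
For readers unfamiliar with the two metrics reported above, here is a minimal pure-Python sketch of how macro F1 and multiclass log loss are usually defined. It is an illustration only: the function names are hypothetical, the labels and probabilities are made up, and nothing here reproduces the reported numbers or the author's evaluation script.

```python
import math


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); treated as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def log_loss(y_true, probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class."""
    total = 0.0
    for t, row in zip(y_true, probs):
        p = min(max(row[t], eps), 1.0 - eps)  # clip to avoid log(0)
        total -= math.log(p)
    return total / len(y_true)


# Made-up labels and probabilities (0 = low, 1 = medium, 2 = high per label_map.json).
y_true = [0, 1, 2, 2]
y_pred = [0, 1, 2, 1]
probs = [
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.2, 0.7],
    [0.2, 0.5, 0.3],
]
print("macro_f1:", round(macro_f1(y_true, y_pred, [0, 1, 2]), 4))
print("log_loss:", round(log_loss(y_true, probs), 4))
```

Macro F1 weights the three classes equally regardless of how many rows each has, while log loss also penalizes confident wrong probabilities, so the two numbers capture different aspects of model quality.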