# Entrepreneurial Readiness Classifier (XGBoost, v2)
A 3-class tabular classifier that predicts entrepreneurial readiness (low, medium, or high) from numeric features describing finances, capacity, experience, and mindset.

This is an XGBoost model, not a Transformer. It's published here for easy access via the Hugging Face Hub.
## Files in this repo

- `xgb_model.json`: XGBoost booster (JSON format)
- `feature_order.json`: ordered list of input feature names (the model expects this order)
- `label_map.json`: mapping `{ "low": 0, "medium": 1, "high": 2 }`
## Input features (and meaning)

The model expects all of the following numeric features (see `feature_order.json` for the exact order):

- `savings`: liquid savings in dollars
- `monthly_income`: monthly income ($)
- `monthly_bills`: monthly fixed expenses ($)
- `monthly_entertainment_spend`: discretionary monthly spend ($)
- `sales_skills_1to10`: self-rated sales skills (1-10)
- `age`: age in years
- `dependents_count`: number of dependents
- `assets`: approximate assets value ($)
- `risk_tolerance_1to10`: self-rated risk tolerance (1-10)
- `confidence_1to10`: self-rated confidence (1-10)
- `idea_difficulty_1to10`: perceived idea difficulty (1-10; higher = harder)
- `runway_months`: months of financial runway (capped at 60; equal to 60 when net burn ≤ 0)
- `savings_to_expense_ratio`: `savings / (monthly_bills + monthly_entertainment_spend)` (capped at 12)
- `prior_businesses_started_`: count of prior startups/ventures (note the trailing underscore)
- `prior_exits`: count of prior exits
- `time_available_hours_per_week`: weekly hours available for the venture
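Because the model reads features by position rather than by name, a small guard that validates a raw input dict against `feature_order.json` and emits the values in model order can catch schema mistakes early. A minimal sketch (the helper name and the shortened feature list are illustrative, not part of this repo):

```python
def to_model_order(row, feature_order):
    """Validate a raw input dict against the expected schema and
    return its values in the exact order the model expects."""
    missing = [f for f in feature_order if f not in row]
    extra = [k for k in row if k not in feature_order]
    if missing or extra:
        raise ValueError(f"feature mismatch: missing={missing}, extra={extra}")
    return [float(row[f]) for f in feature_order]

# Illustrative three-feature schema (the real one has all the features above)
order = ["savings", "monthly_income", "monthly_bills"]
values = to_model_order(
    {"monthly_bills": 3100.0, "savings": 8500.0, "monthly_income": 4200.0},
    order,
)
# values -> [8500.0, 4200.0, 3100.0], regardless of the dict's key order
```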
⚠️ **Important:** The model was trained with `runway_months` and `savings_to_expense_ratio` already computed. For best results, provide these two features at inference time using the same logic.
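As a minimal sketch of that logic, mirroring the caps described in the feature list (the function name is illustrative; the Quick start below bundles the same computation into its helper):

```python
def compute_derived(savings, monthly_income, monthly_bills, monthly_entertainment_spend):
    """Reproduce the training-time derivation of the two features."""
    expenses = monthly_bills + monthly_entertainment_spend
    # savings_to_expense_ratio: savings over monthly expenses, capped at 12
    ratio = min(12.0, savings / expenses) if expenses > 0 else 12.0
    # runway_months: months until savings run out at the current net burn,
    # capped at 60; if income covers expenses (net burn <= 0), use the cap
    net_burn = expenses - monthly_income
    runway = 60.0 if net_burn <= 0 else min(60.0, savings / net_burn)
    return ratio, runway

ratio, runway = compute_derived(8500.0, 4200.0, 3100.0, 200.0)
# expenses = 3300, net burn = -900 (income covers expenses), so runway hits the cap
```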
## Quick start (Python)

### Install

```bash
pip install xgboost pandas huggingface_hub
```
### Load from the Hub and predict
```python
from huggingface_hub import hf_hub_download
from xgboost import XGBClassifier
import json, pandas as pd
REPO_ID = "mjpsm/Entrepreneurial-Readiness-XGB-v2"
# Download artifacts
model_file = hf_hub_download(REPO_ID, "xgb_model.json")
feat_file = hf_hub_download(REPO_ID, "feature_order.json")
map_file = hf_hub_download(REPO_ID, "label_map.json")
# Load model + metadata
clf = XGBClassifier(); clf.load_model(model_file)
feature_order = json.load(open(feat_file))
label_map = json.load(open(map_file))
inv_map = {v: k for k, v in label_map.items()}
# (Optional) helper if you don't provide the two derived features
def add_derived(r):
    bills = float(r["monthly_bills"])
    ent = float(r["monthly_entertainment_spend"])
    income = float(r["monthly_income"])
    savings = float(r["savings"])
    # Ratio (cap at 12, matching training)
    denom = bills + ent
    r["savings_to_expense_ratio"] = min(12.0, (savings / denom) if denom > 0 else 12.0)
    # Runway (cap at 60, matching training)
    net_burn = (bills + ent) - income
    r["runway_months"] = 60.0 if net_burn <= 0 else max(0.0, min(60.0, savings / net_burn))
    return r
# Example row (numbers are illustrative)
row = {
"savings": 8500.0,
"monthly_income": 4200.0,
"monthly_bills": 3100.0,
"monthly_entertainment_spend": 200.0,
"sales_skills_1to10": 7,
"age": 29,
"dependents_count": 0,
"assets": 15000.0,
"risk_tolerance_1to10": 7,
"confidence_1to10": 8,
"idea_difficulty_1to10": 5,
"prior_businesses_started_": 1,
"prior_exits": 0,
"time_available_hours_per_week": 35
}
# If you didn't compute the derived features already, add them:
row = add_derived(row)
# Predict
X = pd.DataFrame([row])[feature_order]
pred_id = int(clf.predict(X)[0])
probs = clf.predict_proba(X)[0].tolist()
print("prediction:", inv_map[pred_id])
print("probs:", {inv_map[i]: round(p, 4) for i, p in enumerate(probs)})
```

## Evaluation results

- `macro_f1` on entrepreneurial_readiness_v2 (balanced), self-reported: 0.954
- `log_loss` on entrepreneurial_readiness_v2 (balanced), self-reported: 0.122