Skill Level Classifier (XGBoost, v2)

What it does: Predicts an entrepreneur’s skill level (Low, Medium, or High) from eight 1–10 numeric features.
Why it’s here: The model is compact, fast, and easy to deploy for tabular inference. It was trained with early stopping, and the skill_level_readiness field is excluded from the inputs to avoid target leakage.

Files in this repo

  • xgb_model_Skill_Level_v2.json — trained XGBoost model
  • feature_order_Skill_Level_v2.json — list of feature names (order-sensitive)
  • label_map_Skill_Level_v2.json — mapping from class name → index (don’t assume order)

Input features (1–10 scale)

All inputs are integers or floats in the range 1–10.

  • years_experience_score
  • education_training_score
  • execution_ability_score
  • problem_solving_score
  • confidence_score
  • idea_difficulty_score
  • leadership_score
  • networking_score
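
The range constraint above can be checked before a row reaches the model. The following is a minimal sketch, not part of the repository: the helper name validate_features is hypothetical, and the feature list is copied from this card for illustration (at inference time the authoritative order comes from feature_order_Skill_Level_v2.json).

```python
# Feature names as listed in this card; the real order should be read
# from feature_order_Skill_Level_v2.json (this list is for illustration).
FEATURE_NAMES = [
    "years_experience_score",
    "education_training_score",
    "execution_ability_score",
    "problem_solving_score",
    "confidence_score",
    "idea_difficulty_score",
    "leadership_score",
    "networking_score",
]


def validate_features(row: dict) -> dict:
    """Return a float-valued copy of `row`, or raise ValueError if it is invalid.

    Hypothetical helper: checks that exactly the eight features are present
    and that every value lies in the documented 1-10 range.
    """
    missing = [name for name in FEATURE_NAMES if name not in row]
    if missing:
        raise ValueError(f"missing features: {missing}")
    extra = [name for name in row if name not in FEATURE_NAMES]
    if extra:
        raise ValueError(f"unexpected features: {extra}")

    clean = {}
    for name in FEATURE_NAMES:
        value = float(row[name])
        # NaN fails this comparison too, so it is rejected as out of range.
        if not 1.0 <= value <= 10.0:
            raise ValueError(f"{name}={value} is outside the 1-10 range")
        clean[name] = value
    return clean
```

Calling validate_features on a raw dict before building the DataFrame turns a silent out-of-range prediction into an explicit error.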

Feature definitions (what 1, 5, 10 roughly mean)

  • years_experience_score — Practical experience in entrepreneurial, business, or technical work.
    1: none · 5: ~3–4 years/moderate exposure · 10: 10+ years/expert-level track record

  • education_training_score — Formal or informal training related to business/entrepreneurship/tech.
    1: no training · 5: some courses/undergrad/bootcamps · 10: advanced degrees/certifications/ongoing education

  • execution_ability_score — Ability to independently complete tasks and ship projects.
    1: needs step-by-step guidance · 5: handles medium complexity with minimal help · 10: ships complex projects reliably, end-to-end

  • problem_solving_score — Adaptability and effectiveness at diagnosing/solving issues.
    1: gets stuck frequently · 5: resolves common issues with some effort · 10: quickly solves complex/novel problems

  • confidence_score — Self-efficacy and decisiveness applying skills in practice.
    1: not confident · 5: moderately confident · 10: highly confident and decisive under uncertainty

  • idea_difficulty_score — Complexity/ambition of the current venture or idea.
    1: very simple/small scope · 5: moderate (e.g., niche app) · 10: highly complex (e.g., AI/biotech/multi-sided marketplace)

  • leadership_score — Experience leading people, projects, or cross-functional efforts.
    1: no leadership experience · 5: led small teams/projects · 10: extensive leadership of large/complex teams

  • networking_score — Ability to leverage mentors, partnerships, and resources/funding.
    1: minimal network/isolated · 5: active connections and events · 10: strong network; partnerships/fundraising proficiency

Note: The dataset also contains a skill_level_readiness (1–10) field, but it is not used as an input in this v2 model to avoid target leakage.
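
If raw records still carry the skill_level_readiness field, it needs to be dropped before inference. A minimal sketch of one way to guard against it, assuming records arrive as plain dicts; the helper names check_feature_list and to_model_row and the sample record are hypothetical, not part of this repository.

```python
# Field named in the note above as excluded from the v2 inputs.
LEAKY_FIELDS = {"skill_level_readiness"}


def check_feature_list(feature_cols):
    """Raise if a feature list would feed a leaky field to the model."""
    leaked = sorted(LEAKY_FIELDS.intersection(feature_cols))
    if leaked:
        raise ValueError(f"leaky fields in feature list: {leaked}")
    return list(feature_cols)


def to_model_row(record, feature_cols):
    """Pick only the model's features from a raw record, in the given order."""
    cols = check_feature_list(feature_cols)
    return [float(record[name]) for name in cols]


# In practice this list is read from feature_order_Skill_Level_v2.json.
feature_cols = [
    "years_experience_score",
    "education_training_score",
    "execution_ability_score",
    "problem_solving_score",
    "confidence_score",
    "idea_difficulty_score",
    "leadership_score",
    "networking_score",
]

# Hypothetical raw record that still carries the extra dataset field.
record = {name: 6 for name in feature_cols}
record["skill_level_readiness"] = 7

row = to_model_row(record, feature_cols)
print(len(row))  # 8: the readiness field is not passed through
```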

Quickstart — Load from Hub & predict

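# pip install xgboost scikit-learn pandas huggingface_hub
# (XGBClassifier is xgboost's scikit-learn wrapper, so scikit-learn must be installed)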

from huggingface_hub import hf_hub_download
from xgboost import XGBClassifier
import json, pandas as pd, numpy as np

REPO_ID = "mjpsm/Skill-Level-XGB-v2"  # change to your repo id if you fork

# Download artifacts
model_file    = hf_hub_download(REPO_ID, "xgb_model_Skill_Level_v2.json")
features_file = hf_hub_download(REPO_ID, "feature_order_Skill_Level_v2.json")
labelmap_file = hf_hub_download(REPO_ID, "label_map_Skill_Level_v2.json")

with open(features_file) as f: FEATURE_COLS = json.load(f)
with open(labelmap_file) as f: LABEL_MAP = json.load(f)      # e.g., {"High":0,"Low":1,"Medium":2}
INV = {v:k for k,v in LABEL_MAP.items()}

# Load model
clf = XGBClassifier()
clf.load_model(model_file)

# Single example
example = {
    "years_experience_score": 6,
    "education_training_score": 7,
    "execution_ability_score": 7,
    "problem_solving_score": 7,
    "confidence_score": 7,
    "idea_difficulty_score": 6,
    "leadership_score": 6,
    "networking_score": 6
}

X = pd.DataFrame([example], columns=FEATURE_COLS).astype("float32").values
probs = clf.predict_proba(X)[0]
pred  = int(np.argmax(probs))

print("Predicted class:", INV[pred])
print("Class probabilities:", {INV[i]: float(probs[i]) for i in range(len(probs))})
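
The decoding step at the end of the quickstart can be factored into a small helper that always goes through the label map rather than assuming a class order. This is a sketch only: decode_probabilities is a hypothetical name, the probability vector is made up, and the mapping is the example shown in the quickstart comment, not necessarily the contents of label_map_Skill_Level_v2.json.

```python
def decode_probabilities(probs, label_map):
    """Map a probability vector to (predicted_label, {label: probability}).

    `label_map` has the same shape as label_map_Skill_Level_v2.json:
    class name -> class index.
    """
    inverse = {index: name for name, index in label_map.items()}
    # The indices must be exactly 0..n-1 for an n-class probability vector.
    if sorted(inverse) != list(range(len(probs))):
        raise ValueError("label map does not match the number of classes")
    by_label = {inverse[i]: float(p) for i, p in enumerate(probs)}
    predicted = max(by_label, key=by_label.get)
    return predicted, by_label


# Example mapping taken from the quickstart comment; probabilities are made up.
label_map = {"High": 0, "Low": 1, "Medium": 2}
predicted, by_label = decode_probabilities([0.10, 0.05, 0.85], label_map)
print(predicted)  # Medium
```

Note that index 2 is "Medium" here, not "High", which is why the label map should be consulted instead of an assumed Low/Medium/High order.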

Evaluation results

Self-reported, on skill_level_dataset_1998 (synthetic, balanced):

  • accuracy: 0.993
  • macro_f1: 0.993
  • log_loss: 0.034
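
For readers who want to recompute these three metrics on their own held-out data, the sketch below spells out the standard definitions in plain Python. It is not the author's evaluation script, and the toy labels and probabilities are invented for illustration. scikit-learn's accuracy_score, f1_score(average="macro"), and log_loss compute the same quantities.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true class index."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, n_classes):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for c in range(n_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class with no true or predicted samples contributes 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_classes


def log_loss(y_true, probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class."""
    total = 0.0
    for t, row in zip(y_true, probs):
        p = min(max(row[t], eps), 1.0)  # clip to avoid log(0)
        total -= math.log(p)
    return total / len(y_true)


# Toy data (made up): class indices and per-class probabilities.
y_true = [0, 1, 2, 2]
probs = [
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.2, 0.7],
    [0.1, 0.6, 0.3],  # misclassified: true class 2, predicted 1
]
y_pred = [row.index(max(row)) for row in probs]

print(accuracy(y_true, y_pred))     # 0.75
print(macro_f1(y_true, y_pred, 3))
print(log_loss(y_true, probs))
```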