metadata
language: en
license: mit
tags:
  - regression
  - soulprint
  - tamu
  - xgboost
  - embeddings
datasets:
  - custom
metrics:
  - mse
  - r2
model-index:
  - name: Tamu-xgb-model
    results:
      - task:
          type: regression
          name: Predicting Tamu Scores
        dataset:
          name: Soulprint Tamu Dataset
          type: custom
          size: 912
        metrics:
          - name: MSE
            type: mse
            value: 0.0167
          - name: R2
            type: r2
            value: 0.803

Tamu XGBoost Regression Model

Overview

The Tamu Regression Model is part of the Soulprint archetype system, designed to measure expressions of lightness, uplift, and shared resonance in text.
It was trained on a balanced dataset of 912 rows, split evenly across three bins of the continuous target score:

  • Low (0.00–0.33): minimal energy, muted or subdued responses
  • Mid (0.34–0.66): moderate energy, rhythmic or collective responses
  • High (0.67–1.00): elevated energy, loud or vibrant expressions

The model outputs a continuous score between 0.00 and 1.00, where higher values correspond to stronger expressions of Tamu energy.
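
To turn a continuous score back into the Low/Mid/High labels above, a small helper could look like the sketch below. The function name `tamu_bin` and the cut points 0.34 and 0.67 are assumptions for illustration: the listed ranges leave small gaps (for example between 0.33 and 0.34), so this sketch treats the lower edge of each range as the threshold. It also clips the input to [0, 1] as a precaution, since a squared-error regressor is not strictly bounded.

```python
def tamu_bin(score: float) -> str:
    """Map a Tamu score to a bin label.

    Hypothetical helper, not part of the published model. The thresholds
    0.34 and 0.67 are assumed from the lower edges of the Mid and High ranges.
    """
    # Clip to [0, 1] in case the regressor slightly over- or undershoots.
    clipped = min(max(float(score), 0.0), 1.0)
    if clipped < 0.34:
        return "Low"
    if clipped < 0.67:
        return "Mid"
    return "High"


# Illustrative scores only (not model outputs).
for s in (0.12, 0.50, 0.91):
    print(s, tamu_bin(s))
```

The label is only a reading aid; the continuous score remains the model's actual output.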


Training Details

  • Dataset size: 912 rows (balanced: 304 per bin)
  • Embedding model: sentence-transformers/all-mpnet-base-v2
  • Regressor: XGBoost Regressor (reg:squarederror)
  • Metrics achieved:
    • MSE: 0.0167
    • R²: 0.803
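
For readers unfamiliar with the two metrics, the sketch below shows their standard definitions in plain Python. The function names and the toy values are assumptions made for illustration; they are not taken from the Soulprint Tamu dataset, and this card does not state which data split the reported numbers were computed on.

```python
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error: average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy labels and predictions (made up for this example).
y_true = [0.0, 0.5, 1.0]
y_pred = [0.1, 0.5, 0.9]

print("MSE:", round(mse(y_true, y_pred), 4))
print("R2:", round(r2(y_true, y_pred), 4))
```

Read this way, an R² of 0.803 would mean the predictions account for roughly 80% of the variance in the target scores on the evaluated data.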

Usage

Inference Example

import xgboost as xgb
from sentence_transformers import SentenceTransformer
from huggingface_hub import hf_hub_download

# -----------------------------
# 1. Download model from Hugging Face Hub
# -----------------------------
REPO_ID = "mjpsm/Tamu-xgb-model"
FILENAME = "Tamu_xgb_model.json"

model_path = hf_hub_download(repo_id=REPO_ID, filename=FILENAME)

# -----------------------------
# 2. Load model + embedder
# -----------------------------
model = xgb.XGBRegressor()
model.load_model(model_path)

# Use the same embedding model as in training (see Training Details)
embedder = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")

# -----------------------------
# 3. Example prediction
# -----------------------------
text = "Inside the library, the pages turned slowly as students whispered."
embedding = embedder.encode([text])
score = model.predict(embedding)[0]

print("Predicted Tamu Score:", round(float(score), 3))