Sankofa XGBoost Regression Model

📖 Model Overview

The Sankofa Regression Model is part of the Soulprint Archetype System, designed to measure how strongly a given text reflects the values of the Sankofa archetype.
It encodes each text with the all-mpnet-base-v2 SentenceTransformer and passes the resulting embedding to an XGBoost regressor trained on a curated 1,000-row dataset.

  • Architecture: SentenceTransformer embeddings + XGBoost regression
  • Output Range: 0.0 → 1.0 (Sankofa alignment score)
  • Training Size: 1,000 rows (balanced distribution)

🌍 What is Sankofa?

The Sankofa archetype emphasizes learning from the past, honoring ancestral wisdom, and applying history to guide future actions.

  • High scores (0.7–1.0): Strong grounding in memory, reflection, and ancestral values
  • Mid scores (0.4–0.6): Some awareness of the past but shallow or inconsistent application
  • Low scores (0.0–0.3): Dismissal of history, impatience, or neglect of lessons from the past
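
The bands above can be turned into a small lookup helper. This is a minimal sketch: the function name `sankofa_band` and the label strings are hypothetical. The listed ranges leave 0.3–0.4 and 0.6–0.7 undefined, so the sketch reports those scores as "unspecified" rather than guessing which band they belong to.

```python
def sankofa_band(score: float) -> str:
    """Map a Sankofa alignment score to the band names listed above."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0.0, 1.0], got {score}")
    if score >= 0.7:
        return "high"
    if 0.4 <= score <= 0.6:
        return "mid"
    if score <= 0.3:
        return "low"
    # Assumption: scores in the gaps (0.3-0.4, 0.6-0.7) are not covered
    # by the documented ranges, so they are flagged instead of assigned.
    return "unspecified"


for example_score in (0.85, 0.5, 0.1, 0.65):
    print(example_score, sankofa_band(example_score))
```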

📊 Training & Evaluation

Training Methodology:

  • Inputs: Free-text statements
  • Labels: Float scores (0.0 → 1.0) for Sankofa alignment
  • Embeddings: all-mpnet-base-v2 from SentenceTransformers
  • Model: XGBoost regressor

Results:

  • MSE: 0.0143
  • RMSE: 0.1198
  • R²: 0.824

An RMSE of about 0.12 means a typical prediction error of roughly ±0.12 on the 0–1 scale, and an R² of 0.824 means the model explains about 82% of the variance in the labeled scores.
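
For readers who want to see how these three numbers relate, here is a minimal sketch that computes them from scratch. The helper name `regression_metrics` and the four example values are made up for illustration and are not taken from the model's dataset. RMSE is the square root of MSE, which matches the reported figures up to rounding (the square root of 0.0143 is about 0.1196).

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MSE, RMSE, and R² for paired true and predicted scores."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual sum of squares: error left over after prediction.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the true scores around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    mse = ss_res / n
    return {"mse": mse, "rmse": math.sqrt(mse), "r2": 1.0 - ss_res / ss_tot}


# Hypothetical scores, for illustration only.
y_true = [0.9, 0.2, 0.5, 0.7]
y_pred = [0.8, 0.3, 0.5, 0.6]
print(regression_metrics(y_true, y_pred))
```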


🚀 Intended Use

  • Measuring alignment of text to Sankofa archetype values
  • Research in Soulprint archetypes & culturally rooted AI models
  • Applications in AI agents, storytelling systems, and reflective analysis tools

⚠️ Limitations

  • The dataset is limited to 1,000 rows; performance could improve with more data.
  • The model is specific to Sankofa and should not be generalized to other archetypes.
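  • Interpretability is limited by the embedding model (all-mpnet-base-v2): the regressor sees only dense embedding vectors, so a score cannot be traced directly to specific words or phrases.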

💡 Example Usage

import joblib
from sentence_transformers import SentenceTransformer
from huggingface_hub import hf_hub_download

# -----------------------------
# 1. Download model from Hugging Face Hub
# -----------------------------
REPO_ID = "mjpsm/Sankofa-xgb-model"
FILENAME = "Sankofa_xgb_model.pkl"

model_path = hf_hub_download(repo_id=REPO_ID, filename=FILENAME)

# -----------------------------
# 2. Load model + embedder
# -----------------------------
model = joblib.load(model_path)
embedder = SentenceTransformer("all-mpnet-base-v2")

# -----------------------------
# 3. Example prediction
# -----------------------------
text = "The group studied old archives before planning, ensuring past mistakes were not repeated."
embedding = embedder.encode([text])
score = model.predict(embedding)[0]

print("Predicted Sankofa Score:", round(float(score), 3))
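
To score several texts at once, the same two steps can be wrapped in a helper. The sketch below is an assumption-laden illustration: `score_texts` is a hypothetical name, and the clamping step assumes the regressor was trained with an unbounded objective (XGBoost's default squared-error objective does not restrict outputs to a range), so raw predictions may fall slightly outside 0.0–1.0. The model card does not state whether that happens for this model.

```python
def score_texts(texts, model, embedder):
    """Return one Sankofa score per input text, clamped to [0.0, 1.0]."""
    texts = list(texts)
    if not texts:
        return []
    # Encode all texts in one call; the embedder returns one vector per text.
    embeddings = embedder.encode(texts)
    raw_scores = model.predict(embeddings)
    # Assumption: clamp in case the regressor overshoots the label range.
    return [min(1.0, max(0.0, float(s))) for s in raw_scores]
```

With the `model` and `embedder` objects loaded as in the example above, `score_texts(["first statement", "second statement"], model, embedder)` returns a list of two floats.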