Sankofa XGBoost Regression Model
📖 Model Overview
The Sankofa Regression Model is part of the Soulprint Archetype System, designed to measure how strongly a given text reflects the values of the Sankofa archetype.
It uses SentenceTransformer embeddings (all-mpnet-base-v2) as input features and an XGBoost regressor trained on a 1,000-row curated dataset.
- Architecture: SentenceTransformer embeddings + XGBoost regression
- Output Range: 0.0 → 1.0 (Sankofa alignment score)
- Training Size: 1,000 rows (balanced distribution)
🌍 What is Sankofa?
The Sankofa archetype emphasizes learning from the past, honoring ancestral wisdom, and applying history to guide future actions.
- High scores (0.7–1.0): Strong grounding in memory, reflection, and ancestral values
- Mid scores (0.4–0.6): Some awareness of the past but shallow or inconsistent application
- Low scores (0.0–0.3): Dismissal of history, impatience, or neglect of lessons from the past
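The score bands above can be turned into a small lookup helper. This is a minimal sketch, not part of the released model: the function name `sankofa_band` and the label strings are hypothetical. The listed ranges leave 0.3–0.4 and 0.6–0.7 unassigned, so the sketch reports those values as "between bands" rather than guessing which band they belong to.

```python
def sankofa_band(score: float) -> str:
    """Map a Sankofa alignment score to the band named in this card.

    Hypothetical helper: the band labels and the "between bands" result
    for the unassigned gaps are assumptions made for illustration.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0.0, 1.0], got {score}")
    if score >= 0.7:
        return "high"   # strong grounding in memory and ancestral values
    if 0.4 <= score <= 0.6:
        return "mid"    # some awareness of the past, applied inconsistently
    if score <= 0.3:
        return "low"    # dismissal or neglect of lessons from the past
    return "between bands"  # 0.3-0.4 and 0.6-0.7 are not defined above


for example_score in (0.85, 0.5, 0.1, 0.65):
    print(example_score, sankofa_band(example_score))
```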
📊 Training & Evaluation
Training Methodology:
- Inputs: Free-text statements
- Labels: Float scores (0.0 → 1.0) for Sankofa alignment
- Embeddings: all-mpnet-base-v2 from SentenceTransformers
- Model: XGBoost regressor
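The methodology above might be wired together roughly as follows. This is a sketch under stated assumptions, not the training script used for this model: the function names, the 80/20 split, the seed, and the default XGBoost hyperparameters in the trailing comment are all assumptions, since the card does not state them. The embedder and regressor are passed in as arguments so the same function works with any objects that expose `encode` and `fit`.

```python
import random
from typing import Any, List, Sequence, Tuple

import numpy as np


def split_indices(n_rows: int, test_fraction: float = 0.2, seed: int = 42) -> Tuple[List[int], List[int]]:
    """Shuffle row indices and return (train_indices, test_indices)."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_test = max(1, int(round(n_rows * test_fraction)))
    return indices[n_test:], indices[:n_test]


def fit_alignment_regressor(
    texts: Sequence[str],
    labels: Sequence[float],
    embedder: Any,
    regressor: Any,
    test_fraction: float = 0.2,  # assumed split, not stated in the card
    seed: int = 42,              # assumed seed, not stated in the card
) -> Tuple[Any, np.ndarray, np.ndarray]:
    """Embed the texts, fit the regressor on the training rows,
    and return the fitted regressor plus the held-out features and labels."""
    if len(texts) != len(labels):
        raise ValueError("texts and labels must have the same length")
    features = np.asarray(embedder.encode(list(texts)), dtype=np.float32)
    targets = np.asarray(labels, dtype=np.float32)
    train_idx, test_idx = split_indices(len(texts), test_fraction, seed)
    regressor.fit(features[train_idx], targets[train_idx])
    return regressor, features[test_idx], targets[test_idx]


# Intended wiring with the libraries named in this card (not executed here):
#   from sentence_transformers import SentenceTransformer
#   from xgboost import XGBRegressor
#   embedder = SentenceTransformer("all-mpnet-base-v2")
#   regressor = XGBRegressor()  # default hyperparameters are an assumption
#   model, X_test, y_test = fit_alignment_regressor(texts, labels, embedder, regressor)
```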
Results:
- MSE: 0.0143
- RMSE: 0.1198
- R²: 0.824
In other words, the root-mean-square prediction error is about 0.12 on the 0.0–1.0 scale, and the model explains roughly 82% of the variance in the labeled scores.
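For readers who want to reproduce these metrics on their own labeled data, the definitions can be sketched in plain Python. The function name `regression_metrics` and the toy values below are illustrative only and are not drawn from the Sankofa dataset.

```python
import math
from typing import Dict, Sequence


def regression_metrics(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """Compute MSE, RMSE, and R² for paired true and predicted scores."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    n = len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    mse = ss_res / n
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"mse": mse, "rmse": math.sqrt(mse), "r2": r2}


# Toy values for illustration only
toy_true = [0.2, 0.5, 0.9, 0.7]
toy_pred = [0.3, 0.4, 0.8, 0.7]
print(regression_metrics(toy_true, toy_pred))
```

As a sanity check on the reported figures, the square root of the reported MSE (0.0143) is about 0.1196, which agrees with the reported RMSE of 0.1198 to three decimal places.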
🚀 Intended Use
- Measuring alignment of text to Sankofa archetype values
- Research in Soulprint archetypes & culturally rooted AI models
- Applications in AI agents, storytelling systems, and reflective analysis tools
⚠️ Limitations
- The dataset is limited to 1,000 rows; performance could improve with more data.
- The model is specific to Sankofa and should not be generalized to other archetypes.
- Interpretability is dependent on the embedding model (all-mpnet-base-v2).
💡 Example Usage
import joblib
from sentence_transformers import SentenceTransformer
from huggingface_hub import hf_hub_download
# -----------------------------
# 1. Download model from Hugging Face Hub
# -----------------------------
REPO_ID = "mjpsm/Sankofa-xgb-model"
FILENAME = "Sankofa_xgb_model.pkl"
model_path = hf_hub_download(repo_id=REPO_ID, filename=FILENAME)
# -----------------------------
# 2. Load model + embedder
# -----------------------------
model = joblib.load(model_path)
embedder = SentenceTransformer("all-mpnet-base-v2")
# -----------------------------
# 3. Example prediction
# -----------------------------
text = "The group studied old archives before planning, ensuring past mistakes were not repeated."
embedding = embedder.encode([text])
score = model.predict(embedding)[0]
print("Predicted Sankofa Score:", round(float(score), 3))
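To score several statements at once, the same steps can be wrapped in a helper. This is a minimal sketch: `score_texts` is a hypothetical name, and clamping to the 0.0–1.0 range is an added assumption, since a regressor's raw output is not guaranteed to stay inside the label range.

```python
from typing import Any, List, Sequence


def score_texts(texts: Sequence[str], embedder: Any, model: Any) -> List[float]:
    """Embed each text, predict its Sankofa score, and clamp it to [0.0, 1.0].

    `embedder` must expose encode(list_of_texts) and `model` must expose
    predict(embeddings), as the objects loaded in the example above do.
    """
    embeddings = embedder.encode(list(texts))
    raw_scores = model.predict(embeddings)
    # Clamping is an assumption added here, not part of the released model
    return [min(1.0, max(0.0, float(s))) for s in raw_scores]
```

With the `embedder` and `model` objects loaded in the example above, `score_texts(["first statement", "second statement"], embedder, model)` would return one score per statement.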