mjpsm/Sankofa-xgb-model

mjpsm committed on Sep 25

Commit

ba3dea3

1 Parent(s): b9c9af3

Create README.md

Files changed (1)

README.md +101 -0

README.md ADDED

	@@ -0,0 +1,101 @@

+---
+language: en
+license: mit
+tags:
+- regression
+- xgboost
+- embeddings
+- soulprint
+- sankofa
+datasets:
+- custom
+metrics:
+- mse
+- rmse
+- r2
+model-index:
+- name: Sankofa_xgb_model
+  results:
+    - task:
+        type: regression
+        name: Archetype Regression
+      metrics:
+        - name: MSE
+          type: mean_squared_error
+          value: 0.0143
+        - name: RMSE
+          type: root_mean_squared_error
+          value: 0.1198
+        - name: R²
+          type: r2_score
+          value: 0.824
+---
+# Sankofa XGBoost Regression Model
+## 📖 Model Overview
+The **Sankofa Regression Model** is part of the **Soulprint Archetype System**, designed to measure how strongly a given text reflects the values of the **Sankofa archetype**.
+It uses **SentenceTransformer embeddings** (`all-mpnet-base-v2`) as input features and an **XGBoost regressor** trained on a **1,000-row curated dataset**.
+- **Architecture**: SentenceTransformer embeddings + XGBoost regression
+- **Output Range**: 0.0 → 1.0 (Sankofa alignment score)
+- **Training Size**: 1,000 rows (balanced distribution)
+---
+## 🌍 What is Sankofa?
+The **Sankofa archetype** emphasizes **learning from the past, honoring ancestral wisdom, and applying history to guide future actions**.
+- **High scores (0.7–1.0)**: Strong grounding in memory, reflection, and ancestral values
+- **Mid scores (0.4–0.6)**: Some awareness of the past but shallow or inconsistent application
+- **Low scores (0.0–0.3)**: Dismissal of history, impatience, or neglect of lessons from the past
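The bands above leave the ranges 0.3–0.4 and 0.6–0.7 unassigned. The following is a minimal sketch of turning a score into a band label, assuming (the card does not say this) that the gaps are split at 0.35 and 0.65. The helper name `sankofa_band` and both cut-offs are hypothetical.

```python
def sankofa_band(score: float, low_cut: float = 0.35, high_cut: float = 0.65) -> str:
    """Map a 0-1 Sankofa score to a coarse band label.

    The cut-offs are illustrative assumptions: the card defines
    0.0-0.3 (low), 0.4-0.6 (mid), and 0.7-1.0 (high), but not the gaps.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {score}")
    if score < low_cut:
        return "low"
    if score < high_cut:
        return "mid"
    return "high"


# Made-up scores, one per band
for s in (0.1, 0.5, 0.9):
    print(s, sankofa_band(s))
```

If the gaps should be treated differently (for example, reported as "borderline"), the two cut-offs are the only values that need to change.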
+---
+## 📊 Training & Evaluation
+**Training Methodology**:
+- Inputs: Free-text statements
+- Labels: Float scores (0.0 → 1.0) for Sankofa alignment
+- Embeddings: `all-mpnet-base-v2` from SentenceTransformers
+- Model: XGBoost regressor
+
+**Results**:
+- **MSE**: 0.0143
+- **RMSE**: 0.1198
+- **R²**: 0.824
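+
+An RMSE of about 0.12 means the root-mean-square prediction error is roughly 0.12 on the 0–1 scale; it is an average error magnitude, not a bound, so individual predictions can miss by more. An R² of 0.824 means the model explains about **82% of the variance** in the labeled scores.
+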
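For reference, the three reported metrics are related as in the sketch below, which uses only the standard library. The helper name `regression_metrics` and the toy values are illustrative and do not come from the model's data. Note that `sqrt(0.0143) ≈ 0.1196`, which matches the reported RMSE of 0.1198 up to rounding of the MSE.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (MSE, RMSE, R^2) for two equal-length sequences of floats."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    # Residual sum of squares and its mean
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    mse = ss_res / n
    rmse = math.sqrt(mse)

    # Total sum of squares around the mean of the true labels
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are equal")

    r2 = 1.0 - ss_res / ss_tot
    return mse, rmse, r2


# Toy values for illustration only
y_true = [0.0, 0.5, 1.0]
y_pred = [0.1, 0.5, 0.9]
mse, rmse, r2 = regression_metrics(y_true, y_pred)
print(f"MSE={mse:.4f} RMSE={rmse:.4f} R2={r2:.3f}")
```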
+---
+## 🚀 Intended Use
+- Measuring **alignment of text to Sankofa archetype values**
+- Research in **Soulprint archetypes & culturally rooted AI models**
+- Applications in **AI agents, storytelling systems, and reflective analysis tools**
+---
+## ⚠️ Limitations
+- The dataset is **limited to 1,000 rows**; performance could improve with more data.
+- The model is **specific to Sankofa** and should not be generalized to other archetypes.
+- Interpretability depends on the embedding model (`all-mpnet-base-v2`): the regressor sees only embedding dimensions, not individual words.
+---
+## 💡 Example Usage
+```python
+import joblib
+from sentence_transformers import SentenceTransformer
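+# Requires joblib, sentence-transformers, and xgboost (needed to unpickle the regressor)
+# Load the trained regressor and the embedding model
+model = joblib.load("Sankofa_xgb_model.pkl")
+embedder = SentenceTransformer("all-mpnet-base-v2")
+# Example text
+text = "The community documented their struggles so future generations could learn."
+# Encode a one-item batch: the result is a 2D array of shape (1, 768)
+embedding = embedder.encode([text])
+# predict returns one score per input row; take the first
+score = model.predict(embedding)[0]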
+print("Predicted Sankofa Score:", round(float(score), 3))
+```
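Depending on the training objective, a tree-ensemble regressor is not necessarily constrained to the label range, so a raw prediction may land slightly below 0.0 or above 1.0. The following is a minimal sketch of clamping scores to the documented range before reporting them; `clamp_score` is a hypothetical helper and the raw values are made up.

```python
def clamp_score(raw: float, lo: float = 0.0, hi: float = 1.0) -> float:
    """Clamp a raw prediction to the documented 0.0-1.0 score range."""
    return max(lo, min(hi, float(raw)))


raw_scores = [-0.03, 0.42, 1.07]  # made-up raw predictions
print([clamp_score(s) for s in raw_scores])  # [0.0, 0.42, 1.0]
```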