Support vector machine classifiers, one per OCEAN (Big Five) trait, trained on Qwen3-Embedding-8B embeddings of Pennebaker and King's Essays dataset.

The classifiers were trained on an 80/20 train/test split (test_size=0.2, random_state=42).
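
The parameter names test_size and random_state suggest scikit-learn's train_test_split, although the training script itself is not shown here. The following is only a sketch under that assumption: the data is synthetic, the 16-dimensional features stand in for real embeddings, and SVC() with default hyperparameters is a placeholder, since the original kernel and settings are not stated.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-ins (assumption): 200 fake "embeddings" and binary trait labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))
y = (X[:, 0] > 0).astype(int)

# Hold out 20% of the samples for testing, with a fixed seed for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# One binary classifier like this would be trained per trait.
clf = SVC()  # default hyperparameters; the original settings are not stated
clf.fit(X_train, y_train)

acc = accuracy_score(y_test, clf.predict(X_test))
print(X_train.shape, X_test.shape, f"accuracy={acc:.3f}")
```

With 200 samples, this split leaves 160 for training and 40 for testing; the accuracy printed here says nothing about the real models.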

Accuracy metrics:

  • Openness (O): 65.6%
  • Conscientiousness (C): 58.9%
  • Extraversion (E): 60.5%
  • Agreeableness (A): 61.1%
  • Neuroticism (N): 57.9%
  • Mean accuracy: 60.81%
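
As a quick check of the figures above, the mean can be recomputed from the rounded per-trait accuracies. They average to 60.80%, so the reported 60.81% presumably comes from the unrounded values (an assumption; the unrounded numbers are not given here).

```python
from statistics import mean

# Per-trait accuracies as listed above (rounded to one decimal place).
accuracies = {"O": 65.6, "C": 58.9, "E": 60.5, "A": 61.1, "N": 57.9}

mean_accuracy = mean(accuracies.values())
best_trait = max(accuracies, key=accuracies.get)

print(f"Mean of rounded accuracies: {mean_accuracy:.2f}%")
print(f"Highest-accuracy trait: {best_trait}")
```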

Download all .joblib files into a models_qwen3_8b/ directory and use them as follows:

from sentence_transformers import SentenceTransformer
from joblib import load
import numpy as np


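# Load the embedding model on the GPU (pass device='cpu' if no GPU is available).
embedder = SentenceTransformer("Qwen/Qwen3-Embedding-8B", device='cuda')
# One classifier file per OCEAN trait, e.g. models_qwen3_8b/o.joblib.
traits = ['o', 'c', 'e', 'a', 'n']
classifiers = {trait: load(f"models_qwen3_8b/{trait}.joblib") for trait in traits}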


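def predict_personality(texts, batch_size=8):
    """Return one [O, C, E, A, N] list of integer labels per input text."""
    # Embed all texts; each row is one L2-normalized embedding.
    embeddings = embedder.encode(
        texts,
        batch_size=batch_size,
        convert_to_numpy=True,
        normalize_embeddings=True,
        show_progress_bar=True,
    )
    # Run each trait classifier once over the whole embedding matrix.
    per_trait = [classifiers[trait].predict(embeddings) for trait in traits]
    # Stack into shape (n_texts, 5) so each row follows the order in `traits`.
    return np.stack(per_trait, axis=1).astype(int).tolist()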


texts = [
    "I enjoy working in solitude and reflecting deeply.",
    "I love going out, meeting new people, and trying new things!"
]
predictions = predict_personality(texts)
for text, profile in zip(texts, predictions):
    print(f"\nText: {text}\nOCEAN: {profile}")
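
Each profile returned by predict_personality is a list of five integers in the order of traits (o, c, e, a, n). A small helper can attach trait names to those values. This is only a sketch: label_profile and TRAIT_NAMES are hypothetical names, not part of the released files, and the meaning of each integer (for example 1 for high and 0 for low) is an assumption, since the label encoding is not documented here.

```python
# Hypothetical mapping from the short trait keys to full trait names.
TRAIT_NAMES = {
    "o": "Openness",
    "c": "Conscientiousness",
    "e": "Extraversion",
    "a": "Agreeableness",
    "n": "Neuroticism",
}


def label_profile(profile, traits=("o", "c", "e", "a", "n")):
    """Pair each predicted integer with its trait name, keeping the trait order."""
    if len(profile) != len(traits):
        raise ValueError("profile must contain one value per trait")
    return {TRAIT_NAMES[t]: int(v) for t, v in zip(traits, profile)}


# Example with a made-up profile; real values come from predict_personality.
example = label_profile([1, 0, 1, 1, 0])
print(example)
```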

We ran the script on an A40 with 32GB of VRAM. With this setup, the embedding step handled batches of up to 8 texts (batch_size=8).
