Support vector machine classifiers (one per OCEAN trait) trained on Qwen3-Embedding-8B embeddings of Pennebaker and King's Essays dataset. The train/test split used `test_size=0.2` and `random_state=42`.
Test-set accuracy per trait:
- Openness (O): 65.6%
- Conscientiousness (C): 58.9%
- Extraversion (E): 60.5%
- Agreeableness (A): 61.1%
- Neuroticism (N): 57.9%
- Mean accuracy: 60.81%
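The exact training script is not included in this card. The sketch below shows how such per-trait classifiers could be reproduced; the file name `essays.csv`, the column names, and the `SVC()` defaults are assumptions, while the embedding model, the one-SVM-per-trait design, and the `test_size=0.2` / `random_state=42` split come from the description above.

```python
# Hypothetical training sketch -- dataset layout and SVM hyperparameters are assumptions.
import pandas as pd
from joblib import dump
from sentence_transformers import SentenceTransformer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

embedder = SentenceTransformer("Qwen/Qwen3-Embedding-8B", device="cuda")

# Assumed layout: one text column plus one binary (0/1) column per OCEAN trait.
df = pd.read_csv("essays.csv")  # hypothetical file name
X = embedder.encode(
    df["text"].tolist(),
    batch_size=8,
    convert_to_numpy=True,
    normalize_embeddings=True,
)

label_cols = ["label_o", "label_c", "label_e", "label_a", "label_n"]  # assumed names
for trait, label_col in zip("ocean", label_cols):
    y = df[label_col].values
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42  # split reported above
    )
    clf = SVC()  # kernel and hyperparameters are not specified in the card
    clf.fit(X_train, y_train)
    print(trait, accuracy_score(y_test, clf.predict(X_test)))
    dump(clf, f"models_qwen3_8b/{trait}.joblib")  # assumes the directory exists
```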
Download all `.joblib` files and use them like this:
```python
from sentence_transformers import SentenceTransformer
from joblib import load

# Embedding model used to produce the training features.
embedder = SentenceTransformer("Qwen/Qwen3-Embedding-8B", device="cuda")

# One SVM classifier per OCEAN trait.
traits = ["o", "c", "e", "a", "n"]
classifiers = {trait: load(f"models_qwen3_8b/{trait}.joblib") for trait in traits}

def predict_personality(texts, batch_size=8):
    # Embed all texts, then score each embedding with the five trait classifiers.
    embeddings = embedder.encode(
        texts,
        batch_size=batch_size,
        convert_to_numpy=True,
        normalize_embeddings=True,
        show_progress_bar=True,
    )
    results = []
    for embedding in embeddings:
        scores = []
        for trait in traits:
            pred = classifiers[trait].predict(embedding.reshape(1, -1))[0]
            scores.append(int(pred))
        results.append(scores)
    return results

texts = [
    "I enjoy working in solitude and reflecting deeply.",
    "I love going out, meeting new people, and trying new things!",
]

predictions = predict_personality(texts)
for text, profile in zip(texts, predictions):
    print(f"\nText: {text}\nOCEAN: {profile}")
```
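Each returned profile is a list of five predictions in O, C, E, A, N order. If you prefer named output, a small helper like the one below works; the trait names and the high/low wording are illustrative, assuming (as the `int(...)` cast suggests) that the stored labels are 0/1 with 1 meaning the "high" pole.

```python
# Hypothetical post-processing helper -- not part of the model files.
TRAIT_NAMES = ["Openness", "Conscientiousness", "Extraversion", "Agreeableness", "Neuroticism"]

def profile_to_dict(profile):
    # Map the 0/1 vector onto trait names; "high"/"low" labels are illustrative.
    return {name: ("high" if score == 1 else "low") for name, score in zip(TRAIT_NAMES, profile)}

for text, profile in zip(texts, predictions):
    print(f"{text} -> {profile_to_dict(profile)}")
```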
We ran the script on an A40 GPU with 32 GB of VRAM; with this setup, the embedding step handles batches of up to 8 texts at a time.
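If you have less VRAM, the simplest knobs are a smaller `batch_size` and, in recent sentence-transformers versions, loading the embedder in half precision via `model_kwargs`. Both are sketched below as our own suggestions, not part of the original script.

```python
import torch
from sentence_transformers import SentenceTransformer

# Half-precision load; assumes a sentence-transformers version that forwards
# model_kwargs to transformers' from_pretrained (>= 2.3).
embedder = SentenceTransformer(
    "Qwen/Qwen3-Embedding-8B",
    device="cuda",
    model_kwargs={"torch_dtype": torch.float16},
)

# Smaller batches trade throughput for lower peak memory.
predictions = predict_personality(texts, batch_size=2)
```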