🧠 Big Five Personality Regression Model

This model predicts the Big Five personality traits (Openness, Conscientiousness, Extraversion, Agreeableness, and Neuroticism) from English free-text input. It outputs five continuous scores, one per trait, nominally on a 0.0 to 1.0 scale.


Model Details

Model Description

  • Developed by: vladinc
  • Model type: DistilBERT (distilbert-base-uncased), fine-tuned for five-output regression
  • Language(s): English
  • License: MIT
  • Finetuned from model: distilbert-base-uncased
  • Trained on: ~8,700 essays from the jingjietan/essays-big5 dataset


Uses

Direct Use

This model can be used to estimate personality profiles from user-written text. It may be useful in psychological analysis, conversational profiling, or educational feedback systems.

Out-of-Scope Use

  • Not intended for clinical or diagnostic use.
  • Should not be used to make hiring, legal, or psychological decisions.
  • Not validated across cultures or demographic groups.

Bias, Risks, and Limitations

  • Trained on essay data; generalizability to tweets, messages, or other short-form texts may be limited.
  • Extraversion (0.680) and Neuroticism (0.564) had the highest validation MSE, so predictions for these traits are likely the least reliable.
  • Cultural and linguistic biases in training data may influence predictions.

Recommendations

Do not use predictions from this model in isolation. Supplement with human judgment and/or other assessment tools.


How to Get Started with the Model

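import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "vladinc/bigfive-regression-model"
model = AutoModelForSequenceClassification.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model.eval()  # disable dropout for inference

text = "I enjoy reflecting on abstract concepts and trying new things."
# Truncate long inputs to the model's maximum sequence length
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():  # gradients are not needed at inference time
    outputs = model(**inputs)

# Tensor of shape (1, 5): one raw regression score per trait.
# Scores are meant to fall between 0.0 and 1.0, but the regression head does not clamp them.
print(outputs.logits)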
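
The raw output is an unlabeled tensor, so a small helper can make it easier to read. The sketch below is not part of the model's API: `label_scores` and `TRAITS` are hypothetical names, and the trait order is an assumption based on the order in which this card lists the traits. Check it against the checkpoint's configuration before relying on it. Clamping to [0, 1] is optional and only a convenience.

```python
from typing import Dict, Sequence

# ASSUMPTION: outputs follow the order in which the model card lists the traits.
TRAITS = (
    "Openness",
    "Conscientiousness",
    "Extraversion",
    "Agreeableness",
    "Neuroticism",
)


def label_scores(raw: Sequence[float], clamp: bool = True) -> Dict[str, float]:
    """Pair raw regression outputs with trait names, optionally clamping to [0, 1]."""
    if len(raw) != len(TRAITS):
        raise ValueError(f"expected {len(TRAITS)} values, got {len(raw)}")
    scores = {}
    for trait, value in zip(TRAITS, raw):
        value = float(value)
        if clamp:
            value = min(1.0, max(0.0, value))
        scores[trait] = value
    return scores


# With the snippet above, the input would be outputs.logits[0].tolist().
example = [0.71, 0.42, 1.08, -0.03, 0.55]  # made-up values for illustration
print(label_scores(example))
```

Passing `clamp=False` keeps the raw values, which is useful for spotting inputs where the model extrapolates outside the intended range.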

Training Details

Training Data

  • Dataset: jingjietan/essays-big5
  • Format: essay text plus 5 numeric labels, one per personality trait

Training Procedure

  • Epochs: 3
  • Batch size: 8
  • Learning rate: 2e-5
  • Loss function: Mean Squared Error
  • Metric for best model: MSE on Openness
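
To make the loss and the best-model metric concrete, here is a minimal pure-Python sketch. It is not the author's training code. It assumes the loss is a mean-reduced MSE over all five outputs and that "MSE on Openness" means the checkpoint with the lowest Openness validation MSE is kept. The function names and toy values are hypothetical.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def per_trait_mse(preds: Matrix, targets: Matrix) -> List[float]:
    """Mean squared error for each output column (one column per trait)."""
    n_rows, n_traits = len(preds), len(preds[0])
    return [
        sum((p[j] - t[j]) ** 2 for p, t in zip(preds, targets)) / n_rows
        for j in range(n_traits)
    ]


def mse_loss(preds: Matrix, targets: Matrix) -> float:
    """Scalar loss: squared error averaged over every row and every trait."""
    per_trait = per_trait_mse(preds, targets)
    return sum(per_trait) / len(per_trait)


def best_epoch(openness_mse_by_epoch: Sequence[float]) -> int:
    """Index of the epoch with the lowest Openness validation MSE."""
    return min(range(len(openness_mse_by_epoch)),
               key=openness_mse_by_epoch.__getitem__)


# Toy batch of two essays with five traits each (values are made up).
preds = [[0.5] * 5, [1.0, 0.0, 1.0, 0.0, 1.0]]
targets = [[1.0, 0.0, 1.0, 0.0, 1.0], [1.0, 0.0, 1.0, 0.0, 1.0]]
print(per_trait_mse(preds, targets))
print(mse_loss(preds, targets))
print(best_epoch([0.40, 0.33, 0.35]))  # made-up per-epoch Openness MSE
```

Selecting on a single trait means the kept checkpoint is not necessarily the best one for the other four traits.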

Evaluation

Metrics

  Trait               Validation MSE
  Openness            0.324
  Conscientiousness   0.537
  Extraversion        0.680
  Agreeableness       0.441
  Neuroticism         0.564
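
MSE is expressed in squared units, so it can help to convert it to root mean squared error (RMSE), which is on the same scale as the scores. The sketch below only re-expresses the validation figures reported above and computes nothing new from the model. The variable names are illustrative.

```python
import math

# Validation MSE per trait, copied from the table above.
validation_mse = {
    "Openness": 0.324,
    "Conscientiousness": 0.537,
    "Extraversion": 0.680,
    "Agreeableness": 0.441,
    "Neuroticism": 0.564,
}

# RMSE is the square root of MSE and shares the units of the scores.
validation_rmse = {trait: math.sqrt(mse) for trait, mse in validation_mse.items()}

for trait, mse in validation_mse.items():
    print(f"{trait:<18} MSE={mse:.3f}  RMSE={validation_rmse[trait]:.3f}")

mean_mse = sum(validation_mse.values()) / len(validation_mse)
worst_trait = max(validation_mse, key=validation_mse.get)
print(f"Mean MSE across traits: {mean_mse:.4f}")
print(f"Highest-error trait: {worst_trait}")
```

If the labels do lie on a 0 to 1 scale, errors of this size are large relative to the scale, which supports treating individual predictions with caution.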

Citation

If you use this model, please cite it:

BibTeX:

@misc{vladinc2025bigfive,
  title={Big Five Personality Regression Model},
  author={vladinc},
  year={2025},
  howpublished={\url{https://huggingface.co/vladinc/bigfive-regression-model}}
}

Contact

If you have questions or suggestions, feel free to reach out via the Hugging Face profile.

Model size: 109M params (Safetensors, tensor type F32)