🧠 Big Five Personality Regression Model

This model predicts the Big Five personality traits (Openness, Conscientiousness, Extraversion, Agreeableness, and Neuroticism) from English free text. It outputs five continuous scores, one per trait, intended to fall between 0.0 and 1.0. The regression head does not clamp its outputs, so values outside that range are possible.

Model Details

Model Description

Developed by: vladinc
Model type: distilbert-base-uncased, fine-tuned
Language(s): English
License: MIT
Finetuned from model: distilbert-base-uncased
Trained on: ~8,700 essays from the jingjietan/essays-big5 dataset

Model Sources

Repository: https://huggingface.co/vladinc/bigfive-regression-model

Uses

Direct Use

This model can be used to estimate personality profiles from user-written text. It may be useful in psychological analysis, conversational profiling, or educational feedback systems.

Out-of-Scope Use

Not intended for clinical or diagnostic use.
Should not be used to make hiring, legal, or psychological decisions.
Not validated across cultures or demographic groups.

Bias, Risks, and Limitations

Trained on essay data; generalizability to tweets, messages, or other short-form texts may be limited.
Extraversion (0.680) and Neuroticism (0.564) had the highest validation MSE of the five traits, suggesting less reliable predictions for them.
Cultural and linguistic biases in training data may influence predictions.

Recommendations

Do not use predictions from this model in isolation. Supplement with human judgment and/or other assessment tools.

How to Get Started with the Model

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("vladinc/bigfive-regression-model")
tokenizer = AutoTokenizer.from_pretrained("vladinc/bigfive-regression-model")

text = "I enjoy reflecting on abstract concepts and trying new things."
# Truncate long inputs to the model's maximum sequence length
inputs = tokenizer(text, return_tensors="pt", truncation=True)

with torch.no_grad():  # gradients are not needed for inference
    outputs = model(**inputs)

# Tensor of shape (1, 5): raw regression outputs, one per trait (not clamped to 0.0-1.0)
print(outputs.logits)
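
The printed tensor carries no trait names. The following is a minimal sketch of a helper that attaches them, assuming the five outputs follow the order in which the traits are listed in the description above (Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism). That order is an assumption and is not confirmed by this card, so it should be checked against the training labels. The helper name `label_scores` and the optional clamping are illustrative, not part of the model.

```python
# Assumed output order, taken from the order the traits are listed in the
# model description. This is NOT confirmed by the model card.
TRAITS = (
    "openness",
    "conscientiousness",
    "extraversion",
    "agreeableness",
    "neuroticism",
)


def label_scores(scores, clamp=False):
    """Pair five raw regression outputs with trait names.

    If clamp is True, values are clipped to the nominal 0.0-1.0 range.
    """
    values = [float(s) for s in scores]
    if len(values) != len(TRAITS):
        raise ValueError(f"expected {len(TRAITS)} scores, got {len(values)}")
    if clamp:
        values = [min(1.0, max(0.0, v)) for v in values]
    return dict(zip(TRAITS, values))


# Hypothetical raw outputs, for illustration only
example = label_scores([0.71, 0.42, 1.08, -0.05, 0.33], clamp=True)
print(example)
```

With the model loaded as above, the call would be `label_scores(outputs.logits[0].tolist())`.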

Training Details

Training Data

Dataset: jingjietan/essays-big5
Format: Essay text + 5 numeric labels for personality traits

Training Procedure

Epochs: 3
Batch size: 8
Learning rate: 2e-5
Loss Function: Mean Squared Error
Metric for Best Model: MSE on Openness
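
To make the "MSE on Openness" selection metric concrete, here is a minimal sketch of a per-trait MSE callback in the shape the Hugging Face `Trainer` accepts for `compute_metrics`. This card does not state how training was actually implemented, so the use of `Trainer`, the metric key names (`mse_openness` and so on), and the trait order are all assumptions made for illustration.

```python
# Assumed label order; not confirmed by the model card.
TRAITS = (
    "openness",
    "conscientiousness",
    "extraversion",
    "agreeableness",
    "neuroticism",
)


def compute_metrics(eval_pred):
    """Return one MSE value per trait.

    eval_pred unpacks into (predictions, labels), each holding N rows of
    five values. Works with nested lists or NumPy arrays.
    """
    predictions, labels = eval_pred
    metrics = {}
    for i, trait in enumerate(TRAITS):
        squared_errors = [
            (float(p[i]) - float(l[i])) ** 2 for p, l in zip(predictions, labels)
        ]
        metrics[f"mse_{trait}"] = sum(squared_errors) / len(squared_errors)
    return metrics


# Tiny made-up batch, for illustration only
preds = [[0.5, 0.5, 0.5, 0.5, 0.5], [1.0, 0.0, 0.5, 0.5, 0.5]]
labels = [[0.0, 0.5, 0.5, 0.5, 0.5], [1.0, 1.0, 0.5, 0.5, 0.5]]
print(compute_metrics((preds, labels)))
```

Under these assumptions, passing `metric_for_best_model="mse_openness"` and `greater_is_better=False` to `TrainingArguments` would keep the checkpoint with the lowest Openness error. Selecting on a single trait may leave the other four traits less well tuned.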

Evaluation

Metrics

Trait	Validation MSE
Openness	0.324
Conscientiousness	0.537
Extraversion	0.680
Agreeableness	0.441
Neuroticism	0.564
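
MSE is expressed in squared label units, which makes the values above hard to read against the score scale. The sketch below converts the reported figures to RMSE, which is in the same units as the labels, and lists the traits from lowest to highest error. The numbers are copied from the table; nothing else is assumed.

```python
import math

# Validation MSE values as reported in the table above
validation_mse = {
    "Openness": 0.324,
    "Conscientiousness": 0.537,
    "Extraversion": 0.680,
    "Agreeableness": 0.441,
    "Neuroticism": 0.564,
}

# RMSE is the square root of MSE, so it is in label units
validation_rmse = {trait: math.sqrt(mse) for trait, mse in validation_mse.items()}

# Print traits from lowest to highest error
for trait, rmse in sorted(validation_rmse.items(), key=lambda kv: kv[1]):
    print(f"{trait:<18} MSE={validation_mse[trait]:.3f}  RMSE={rmse:.3f}")
```

If the labels do lie on a 0.0 to 1.0 scale, RMSE values above 0.5 indicate large errors relative to that range, which supports treating the predictions as rough estimates.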

Citation

If you use this model, please cite it:

BibTeX:

@misc{vladinc2025bigfive,
  title={Big Five Personality Regression Model},
  author={vladinc},
  year={2025},
  howpublished={\url{https://huggingface.co/vladinc/bigfive-regression-model}}
}

Contact

If you have questions or suggestions, feel free to reach out via the Hugging Face profile.
