Ancient Greek Valence BERT
This is a pranaydeeps/Ancient-Greek-BERT model fine-tuned for valence regression on Ancient Greek texts. Rather than assigning a class label, it predicts a continuous sentiment valence score ranging from -1.0 (most negative) to +1.0 (most positive).
This model was developed as part of a Ph.D. dissertation at Yonsei University on sentiment analysis of the Pauline epistles.
Model Description
The model is intended for academic use in Digital Humanities, Classics, and New Testament studies to analyze the sentiment polarity of texts in Koine and Homeric Greek. It takes a Greek sentence as input and returns a single regression score.
Training and Evaluation
Training Data
The model was trained on a custom-built corpus of 693 samples that combines two main sources:
- Homeric Greek Dataset: The sentiment dataset from the Iliad, developed by Pavlopoulos et al. (2022).
- New Testament (Koine Greek) Dataset: A new, bespoke corpus annotated by a panel of eight New Testament studies experts from Yonsei University. This smaller dataset was expanded using back-translation and generative augmentation techniques to balance the training pool.
All training data and scripts are available at the GitHub repository.
Training Procedure
The model was fine-tuned for a regression task using a Mean Squared Error (MSE) loss function. Key hyperparameters include a learning rate of 5e-5, a batch size of 64, and the AdamW optimizer. Training used early stopping based on the Spearman correlation on a validation set.
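The two ingredients named above, an MSE objective and early stopping on a validation metric, can be sketched in plain Python. This is a minimal illustration, not the training script used for this model: the `EarlyStopper` class, its `patience` parameter, and the per-epoch scores in the usage note are all assumptions made for the example.

```python
from typing import Optional, Sequence


def mse_loss(predictions: Sequence[float], targets: Sequence[float]) -> float:
    """Mean squared error between predicted and gold valence scores."""
    if len(predictions) != len(targets) or not predictions:
        raise ValueError("predictions and targets must be non-empty and equal in length")
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


class EarlyStopper:
    """Signals a stop once the validation metric has not improved for `patience` epochs.

    Hypothetical helper: the patience value actually used in training is not stated.
    """

    def __init__(self, patience: int = 3) -> None:
        self.patience = patience
        self.best_score: Optional[float] = None
        self.best_epoch: Optional[int] = None
        self.epochs_without_improvement = 0

    def update(self, epoch: int, score: float) -> bool:
        """Record one epoch's validation score; return True if training should stop."""
        if self.best_score is None or score > self.best_score:
            # Higher is better for a correlation metric such as Spearman.
            self.best_score = score
            self.best_epoch = epoch
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience
```

In a training loop, `update` would be called once per epoch with the validation Spearman correlation, and the checkpoint from `best_epoch` would be the one kept.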
Evaluation Results
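The model's performance was evaluated on two separate, unseen test sets. Correlations are nearly identical across the two domains (about 0.66 Pearson and 0.65 Spearman), which indicates that the model generalizes consistently to both Koine and Homeric Greek.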
Test Set | Pearson Correlation | Spearman Correlation |
---|---|---|
New Testament | 0.661 | 0.648 |
Homeric | 0.660 | 0.649 |
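For readers who want to compute the same two metrics on their own predictions, the following is a minimal pure-Python sketch of Pearson and Spearman correlation. It is not the evaluation script behind the table above; the function names are chosen for illustration, and ties are handled with average ranks, which is a common convention but an assumption here.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))
```

Here `x` would hold the gold valence annotations and `y` the model's predicted scores for the same sentences.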
How to Use
You can use this model with the transformers library pipeline for sentiment analysis. Since this is a regression model, the output will be a raw score, not a "POSITIVE" or "NEGATIVE" label.
from transformers import pipeline
# Load the model from the Hub
valence_analyzer = pipeline("sentiment-analysis", model="luvnpce83/ancient-greek-valence-bert")
# Analyze an example sentence
text = "Μῆνιν ἄειδε θεὰ Πηληϊάδεω Ἀχιλῆος" # "Sing, Goddess, the wrath of Peleus' son Achilles"
result = valence_analyzer(text)
# The result is a score from -1.0 to 1.0
print(result)
# [{'label': 'LABEL_0', 'score': 0.375}] -> A positive valence score
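Because the pipeline returns a list of dictionaries rather than bare numbers, a small post-processing step is often convenient. The sketch below pulls the score out of output shaped like the example above. The helper name `extract_valence` and the optional clamping to the documented [-1.0, 1.0] range are assumptions for illustration, since a regression head can in principle produce values slightly outside that range.

```python
from typing import Any, Dict, List


def extract_valence(pipeline_output: List[Dict[str, Any]], clamp: bool = True) -> List[float]:
    """Return the valence score from each pipeline result dict.

    Hypothetical helper: with clamp=True, scores are clipped to [-1.0, 1.0].
    """
    scores = []
    for item in pipeline_output:
        score = float(item["score"])
        if clamp:
            score = max(-1.0, min(1.0, score))
        scores.append(score)
    return scores


# Output in the same shape as the pipeline result shown above.
example_output = [
    {"label": "LABEL_0", "score": 0.375},
    {"label": "LABEL_0", "score": -1.2},  # hypothetical out-of-range prediction
]
print(extract_valence(example_output))
# [0.375, -1.0]
```

One caveat about raw scores: in the transformers text-classification pipeline, a single-output model whose configuration does not set `problem_type` to `"regression"` has a sigmoid applied by default. Passing `function_to_apply="none"` when calling the pipeline requests the unmodified score. Whether this checkpoint's configuration sets that field is not stated here, so it is worth checking if the scores never go below zero.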