Ancient Greek Valence BERT

This model is a fine-tuned version of pranaydeeps/Ancient-Greek-BERT for valence regression on Ancient Greek texts. Rather than performing classification, it predicts a continuous sentiment valence score ranging from -1.0 (most negative) to +1.0 (most positive).

This model was developed as part of a Ph.D. dissertation at Yonsei University focusing on sentiment analysis of the Pauline epistles.

Model Description

The model is intended for academic use in Digital Humanities, Classics, and New Testament studies to analyze the sentiment polarity of texts in Koine and Homeric Greek. It takes a Greek sentence as input and returns a single regression score.

Training and Evaluation

Training Data

The model was trained on a custom-built corpus of 693 samples combining two main sources:

  1. Homeric Greek Dataset: The sentiment dataset from the Iliad, developed by Pavlopoulos et al. (2022).
  2. New Testament (Koine Greek) Dataset: A new, bespoke corpus annotated by a panel of eight New Testament studies experts from Yonsei University. This smaller dataset was expanded using back-translation and generative augmentation techniques to balance the training pool (a rough sketch of the back-translation step follows this list).
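
The exact augmentation scripts are part of the released training materials; purely as an illustration, the back-translation step can be sketched as a round trip through a pivot language. The translation callables below are placeholders, not the models that were actually used.

from typing import Callable, List

def back_translate(sentences: List[str],
                   to_pivot: Callable[[str], str],
                   from_pivot: Callable[[str], str]) -> List[str]:
    # Round-trip each sentence through a pivot language (e.g. Greek -> English -> Greek)
    # to obtain paraphrased variants that keep the original valence annotation.
    return [from_pivot(to_pivot(s)) for s in sentences]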

All training data and scripts are available at the GitHub repository.

Training Procedure

The model was fine-tuned for a regression task using a Mean Squared Error (MSE) loss function. Key hyperparameters include a learning rate of 5e-5, a batch size of 64, and an AdamW optimizer. Training was performed with early stopping based on the Spearman correlation on a validation set.
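
The full training script is available in the repository; the snippet below is a minimal sketch of an equivalent setup with the transformers Trainer. The dataset names (train_ds, val_ds), epoch count, and early-stopping patience are illustrative assumptions, not reported values; the datasets are assumed to be pre-tokenized and to carry a float "labels" column.

from scipy.stats import spearmanr
from transformers import (AutoModelForSequenceClassification, EarlyStoppingCallback,
                          Trainer, TrainingArguments)

# num_labels=1 with problem_type="regression" gives a single output head trained
# with MSE loss, matching the procedure described above.
model = AutoModelForSequenceClassification.from_pretrained(
    "pranaydeeps/Ancient-Greek-BERT", num_labels=1, problem_type="regression")

def compute_metrics(eval_pred):
    preds = eval_pred.predictions.squeeze()
    labels = eval_pred.label_ids
    return {"spearman": spearmanr(preds, labels)[0]}

args = TrainingArguments(
    output_dir="ancient-greek-valence-bert",
    learning_rate=5e-5,
    per_device_train_batch_size=64,
    num_train_epochs=20,              # illustrative; early stopping ends training
    eval_strategy="epoch",            # "evaluation_strategy" on transformers < 4.41
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="spearman",
    greater_is_better=True,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,           # assumed: pre-tokenized dataset with float "labels"
    eval_dataset=val_ds,
    compute_metrics=compute_metrics,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],  # patience is illustrative
)
trainer.train()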

Evaluation Results

The model's performance was evaluated on two separate, unseen test sets. The results demonstrate strong and consistent generalization across both domains.

Test Set         Pearson Correlation   Spearman Correlation
New Testament    0.661                 0.648
Homeric          0.660                 0.649
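
For reference, these correlations are computed between the model's predicted scores and the annotated gold scores. With scipy this looks like the snippet below; the arrays are illustrative, not the actual test-set values.

from scipy.stats import pearsonr, spearmanr

gold = [0.8, -0.5, 0.1, -0.9, 0.4]   # annotated valence scores (illustrative)
pred = [0.6, -0.4, 0.3, -0.7, 0.2]   # model predictions for the same sentences (illustrative)

print("Pearson: ", pearsonr(gold, pred)[0])
print("Spearman:", spearmanr(gold, pred)[0])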

How to Use

You can use this model with the transformers library pipeline for sentiment analysis. Since this is a regression model, the output is a single numeric score under a generic label (e.g. LABEL_0), not a "POSITIVE" or "NEGATIVE" label.

from transformers import pipeline

# Load the model from the Hub
valence_analyzer = pipeline("sentiment-analysis", model="luvnpce83/ancient-greek-valence-bert")

# Analyze an example sentence
text = "Μῆνιν ἄειδε θεὰ Πηληϊάδεω Ἀχιλῆος" # "Sing, Goddess, the wrath of Peleus' son Achilles"
result = valence_analyzer(text)

# The result is a score from -1.0 to 1.0
print(result)
# [{'label': 'LABEL_0', 'score': 0.375}] -> A positive valence score
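
Depending on the transformers version, the text-classification pipeline may apply an activation function to the model's single output head. To read the raw regression value directly, one option (a sketch, not the authors' documented usage) is to call the model yourself:

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "luvnpce83/ancient-greek-valence-bert"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer("Μῆνιν ἄειδε θεὰ Πηληϊάδεω Ἀχιλῆος", return_tensors="pt")
with torch.no_grad():
    valence = model(**inputs).logits.squeeze().item()  # raw valence score in [-1, 1]
print(valence)

Alternatively, the pipeline itself accepts function_to_apply="none" to skip the activation and return the raw score.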