luvnpce83/ancient-greek-valence-bert

+---
+language:
+- grc
+license: apache-2.0
+library_name: transformers
+tags:
+- ancient-greek
+- koine-greek
+- sentiment-analysis
+- regression
+- digital-humanities
+base_model: pranaydeeps/Ancient-Greek-BERT
+datasets:
+- custom
+---
+# Ancient Greek Valence BERT
+This is a `pranaydeeps/Ancient-Greek-BERT` model fine-tuned for valence regression on Ancient Greek texts. Rather than assigning a class label, it predicts a continuous sentiment **valence score** ranging from **-1.0 (most negative) to +1.0 (most positive)**.
+This model was developed as part of a Ph.D. dissertation at Yonsei University on sentiment analysis of the Pauline epistles.
+## Model Description
+The model is intended for academic use in Digital Humanities, Classics, and New Testament studies to analyze the sentiment polarity of texts in Koine and Homeric Greek. It takes a Greek sentence as input and returns a single regression score.
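A model trained with an MSE objective usually has an unbounded linear output, so a predicted score may occasionally land slightly outside the nominal range. The helper below is a minimal sketch of one way to handle that in downstream analysis. The name `clamp_valence` is hypothetical and not part of this repository, and whether clipping is appropriate depends on the study design.

```python
def clamp_valence(score: float, low: float = -1.0, high: float = 1.0) -> float:
    """Clip a raw regression output to the nominal valence range [low, high].

    Hypothetical helper for illustration; not part of the released model code.
    """
    return max(low, min(high, score))


# Scores inside the range pass through unchanged; overshoots are clipped.
raw_scores = [0.375, 1.07, -1.2]
print([clamp_valence(s) for s in raw_scores])
```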
+## Training and Evaluation
+### Training Data
+The model was trained on a custom-built corpus of 693 samples that combines two main sources:
+1.  **Homeric Greek Dataset**: The sentiment dataset from the *Iliad*, developed by Pavlopoulos et al. (2022).
+2.  **New Testament (Koine Greek) Dataset**: A new, bespoke corpus annotated by a panel of eight New Testament studies experts from Yonsei University. This smaller dataset was expanded using back-translation and generative augmentation techniques to balance the training pool.
+All training data and scripts are available at the [GitHub repository](https://github.com/luvnpce83/koine-greek-sentiment-analysis).
+### Training Procedure
+The model was fine-tuned for regression with a Mean Squared Error (MSE) loss. Key hyperparameters are a learning rate of 5e-5, a batch size of 32, and the AdamW optimizer. Early stopping was based on the Spearman correlation on a validation set.
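The loss and the stopping rule described above can be sketched in plain Python. This is an illustration only, not the training script from the repository: the `EarlyStopping` class, the patience of 3 epochs, and the per-epoch validation values are all assumptions made for the example.

```python
from dataclasses import dataclass


def mse_loss(predictions, targets):
    """Mean squared error between predicted and reference valence scores."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


@dataclass
class EarlyStopping:
    """Stop when a higher-is-better validation metric stops improving."""

    patience: int = 3  # assumed value; the model card does not state one
    best_score: float = float("-inf")
    best_epoch: int = -1
    epochs_without_improvement: int = 0

    def update(self, epoch: int, score: float) -> bool:
        """Record one validation score; return True when training should stop."""
        if score > self.best_score:
            self.best_score = score
            self.best_epoch = epoch
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience


# Hypothetical validation Spearman correlations, one per epoch.
val_spearman = [0.41, 0.55, 0.61, 0.60, 0.59, 0.58, 0.62]

stopper = EarlyStopping(patience=3)
for epoch, score in enumerate(val_spearman):
    if stopper.update(epoch, score):
        break  # in a real run, restore the checkpoint saved at best_epoch

print(f"stopped at epoch {epoch}, best epoch {stopper.best_epoch}")
```

Monitoring a rank correlation rather than the MSE itself means the kept checkpoint is the one that orders sentences best by valence, even if its absolute errors are not the smallest.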
+### Evaluation Results
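+The model was evaluated on two separate held-out test sets. Correlations with the reference scores are moderate (roughly 0.63 to 0.64) and nearly identical across the two domains, which suggests that the model generalizes consistently to both Koine and Homeric Greek.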
+| Test Set        | Pearson Correlation | Spearman Correlation |
+|-----------------|---------------------|----------------------|
+| New Testament   | 0.643               | 0.629                |
+| Homeric         | 0.639               | 0.628                |
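For readers who want to compute the same two metrics on their own annotations, here is a minimal dependency-free sketch. The `gold` and `pred` values are made up for illustration and are not taken from the test sets above. In practice, `scipy.stats.pearsonr` and `scipy.stats.spearmanr` provide the same statistics.

```python
import math


def pearson(x, y):
    """Pearson correlation: covariance normalized by both standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up reference and predicted valence scores, for illustration only.
gold = [-0.8, -0.2, 0.1, 0.5, 0.9]
pred = [-0.5, -0.3, 0.3, 0.2, 0.7]

print(f"Pearson:  {pearson(gold, pred):.3f}")
print(f"Spearman: {spearman(gold, pred):.3f}")
```

Pearson measures linear agreement with the reference scores, while Spearman only checks whether sentences are ranked in the same order, so the two can diverge when predictions are compressed toward zero.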
+## How to Use
+You can use this model with the `transformers` library pipeline for sentiment analysis. Since this is a regression model, the output will be a raw score, not a "POSITIVE" or "NEGATIVE" label.
+```python
+from transformers import pipeline
+# Load the model from the Hub
+valence_analyzer = pipeline("sentiment-analysis", model="luvnpce83/ancient-greek-valence-bert")
+# Analyze an example sentence
+text = "Μῆνιν ἄειδε θεὰ Πηληϊάδεω Ἀχιλῆος" # "Sing, Goddess, the wrath of Peleus' son Achilles"
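+# function_to_apply="none" asks the pipeline for the raw regression output
+# rather than a sigmoid-transformed value
+result = valence_analyzer(text, function_to_apply="none")
+# The result is a score nominally between -1.0 and 1.0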
+print(result)
+# [{'label': 'LABEL_0', 'score': 0.375}] -> A positive valence score