---
language:
- en
license: apache-2.0
library_name: transformers
tags:
- ancient-greek
- koine-greek
- sentiment-analysis
- regression
- digital-humanities
base_model: pranaydeeps/Ancient-Greek-BERT
datasets:
- custom
---

# Ancient Greek Valence BERT

This is a `pranaydeeps/Ancient-Greek-BERT` model fine-tuned for valence regression on Ancient Greek texts. It is not a classifier: it predicts a continuous sentiment **valence score** ranging from **-1.0 (most negative) to +1.0 (most positive)**.

This model was developed as part of a Ph.D. dissertation at Yonsei University on sentiment analysis of the Pauline epistles.

## Model Description

The model is intended for academic use in Digital Humanities, Classics, and New Testament studies to analyze the sentiment polarity of texts in Koine and Homeric Greek. It takes a Greek sentence as input and returns a single regression score.

## Training and Evaluation

### Training Data

The model was trained on a custom-built corpus of 693 samples that combines two main sources:

1. **Homeric Greek Dataset**: The sentiment dataset from the *Iliad*, developed by Pavlopoulos et al. (2022).
2. **New Testament (Koine Greek) Dataset**: A new, bespoke corpus annotated by a panel of eight New Testament studies experts from Yonsei University. This smaller dataset was expanded using back-translation and generative augmentation techniques to balance the training pool.

All training data and scripts are available at the [GitHub repository](https://github.com/luvnpce83/koine-greek-sentiment-analysis).

### Training Procedure

The model was fine-tuned for a regression task using a Mean Squared Error (MSE) loss function. Key hyperparameters include a learning rate of 5e-5, a batch size of 64, and an AdamW optimizer. Training used early stopping based on the Spearman correlation on a validation set.

### Evaluation Results

The model was evaluated on two separate, unseen test sets. The correlations are nearly identical across both domains (about 0.66 Pearson and 0.65 Spearman), which indicates consistent generalization to Koine and Homeric Greek.

| Test Set      | Pearson Correlation | Spearman Correlation |
|---------------|---------------------|----------------------|
| New Testament | 0.661               | 0.648                |
| Homeric       | 0.660               | 0.649                |

## How to Use

You can use this model with the `transformers` library pipeline for sentiment analysis. Since this is a regression model, the output will be a raw score, not a "POSITIVE" or "NEGATIVE" label.

```python
from transformers import pipeline

# Load the model from the Hub
valence_analyzer = pipeline("sentiment-analysis", model="luvnpce83/ancient-greek-valence-bert")

# Analyze an example sentence
text = "Μῆνιν ἄειδε θεὰ Πηληϊάδεω Ἀχιλῆος"  # "Sing, Goddess, the wrath of Peleus' son Achilles"
result = valence_analyzer(text)

# The result is a score from -1.0 to 1.0
print(result)
# [{'label': 'LABEL_0', 'score': 0.375}] -> A positive valence score
```