luvnpce83 committed on
Commit 07522d3 · verified · 1 Parent(s): 1567f14

Update README.md

  1. README.md +67 -3
README.md CHANGED
@@ -1,3 +1,67 @@
- ---
- license: apache-2.0
- ---
+ ---
+ language:
+ - en
+ license: apache-2.0
+ library_name: transformers
+ tags:
+ - ancient-greek
+ - koine-greek
+ - sentiment-analysis
+ - regression
+ - digital-humanities
+ base_model: pranaydeeps/Ancient-Greek-BERT
+ datasets:
+ - custom
+ ---
+
+ # Ancient Greek Valence BERT
+
+ This model is `pranaydeeps/Ancient-Greek-BERT` fine-tuned for valence regression on Ancient Greek texts. It is not a classifier: it predicts a continuous sentiment **valence score** ranging from **-1.0 (most negative) to +1.0 (most positive)**.
+
+ The model was developed as part of a Ph.D. dissertation at Yonsei University on sentiment analysis of the Pauline epistles.
+
+ ## Model Description
+
+ The model is intended for academic use in Digital Humanities, Classics, and New Testament studies, where it can be used to analyze the sentiment polarity of texts in Koine and Homeric Greek. It takes a Greek sentence as input and returns a single regression score.
+
+ ## Training and Evaluation
+
+ ### Training Data
+
+ The model was trained on a custom-built corpus of 693 samples that combines two sources:
+ 1. **Homeric Greek Dataset**: The sentiment dataset from the *Iliad*, developed by Pavlopoulos et al. (2022).
+ 2. **New Testament (Koine Greek) Dataset**: A new corpus annotated by a panel of eight New Testament studies experts from Yonsei University. This smaller dataset was expanded using back-translation and generative augmentation to balance the training pool.
+
+ All training data and scripts are available at the [GitHub repository](https://github.com/luvnpce83/koine-greek-sentiment-analysis).
+
+ ### Training Procedure
+
+ The model was fine-tuned as a regression task with a Mean Squared Error (MSE) loss. Key hyperparameters include a learning rate of 5e-5, a batch size of 32, and the AdamW optimizer. Training used early stopping based on the Spearman correlation on a validation set.
+
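The loss and the early-stopping rule described in the training procedure can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the training script from the repository: `mse_loss`, `EarlyStopper`, and the `patience` setting are hypothetical names and values chosen for the example, since the README does not state how many non-improving epochs were tolerated.

```python
from dataclasses import dataclass


def mse_loss(predictions, targets):
    """Mean squared error between predicted and gold valence scores."""
    if len(predictions) != len(targets) or not targets:
        raise ValueError("predictions and targets must be non-empty and equal in length")
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


@dataclass
class EarlyStopper:
    """Signals a stop once the validation metric has stopped improving."""

    patience: int = 3  # assumed value; not given in the README
    best_score: float = float("-inf")
    epochs_without_improvement: int = 0

    def should_stop(self, validation_spearman: float) -> bool:
        """Record one epoch's validation Spearman and report whether to stop."""
        if validation_spearman > self.best_score:
            # New best epoch: remember it and reset the counter.
            self.best_score = validation_spearman
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience
```

In a training loop, `should_stop` would be called once per epoch with the Spearman correlation measured on the validation set, and the checkpoint from the epoch that set `best_score` would be the one kept.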
+ ### Evaluation Results
+
+ The model was evaluated on two separate, unseen test sets. The correlations are nearly identical on both (about 0.64 Pearson and 0.63 Spearman), which indicates consistent generalization across the two domains.
+
+ | Test Set      | Pearson Correlation | Spearman Correlation |
+ |---------------|---------------------|----------------------|
+ | New Testament | 0.643               | 0.629                |
+ | Homeric       | 0.639               | 0.628                |
+
+
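For readers who want to compute the two metrics in the table on their own predictions, the following is a minimal pure-Python sketch. The function names `pearson`, `average_ranks`, and `spearman` are illustrative and are not taken from the project's evaluation scripts, which may well use a library such as SciPy instead.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have the same length and at least two items")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))
```

Here `x` would hold the gold valence annotations and `y` the model's predicted scores for the same sentences. Pearson measures linear agreement, while Spearman only measures whether the two orderings agree.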
+ ## How to Use
+
+ You can load this model with the `pipeline` API of the `transformers` library. Because this is a regression model, the output is a raw score, not a "POSITIVE" or "NEGATIVE" label.
+
+ ```python
+ from transformers import pipeline
+
+ # Load the model from the Hub
+ valence_analyzer = pipeline("sentiment-analysis", model="luvnpce83/ancient-greek-valence-bert")
+
+ # Analyze an example sentence
+ text = "Μῆνιν ἄειδε θεὰ Πηληϊάδεω Ἀχιλῆος"  # "Sing, Goddess, the wrath of Peleus' son Achilles"
+
+ # function_to_apply="none" requests the raw regression output. Without it, the
+ # pipeline may apply a sigmoid to a single-output model unless the model config
+ # sets problem_type="regression".
+ result = valence_analyzer(text, function_to_apply="none")
+
+ # The result is a score from -1.0 to 1.0
+ print(result)
+ # [{'label': 'LABEL_0', 'score': 0.375}] -> A positive valence score