Update README.md
README.md
CHANGED
@@ -20,9 +20,9 @@ base_model:
# ru-summary-quality-metric

-This model is a fine-tuned version of [`ai-forever/ruT5-large`](https://huggingface.co/ai-forever/ruT5-large), trained for binary quality assessment of summaries

-**Important:**

## Data and Training Metric

@@ -30,12 +30,12 @@ The model was fine-tuned on [SEAHORSE](https://huggingface.co/datasets/hgissbkh/

This specific model focuses on the Q6 Conciseness metric. According to the SEAHORSE paper authors, Q6 is considered one of the most high-level and challenging quality metrics.

-* **Training Data:** `ru` and `en` subsets training split, filtered for `conciseness` labels.
* **Evaluation Data:** only the `ru` subset of the validation and test splits.

## Evaluation Results

-|Test
|-|-|-|
|All|0.479|0.792|
|≥ 20 summary words |0.459|0.781|
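
The two score columns of the Evaluation Results table are labelled Pearson Correlation and ROC AUC in the updated README. The sketch below shows one way such metrics can be computed from predicted probabilities and binary labels. It assumes the correlation is taken between model scores and 0/1 labels, which the README does not spell out. The function names and the toy data are hypothetical, and the actual evaluation code is not shown in this diff.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the chance that a positive outscores a negative (ties count half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not taken from SEAHORSE).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

r = pearson(scores, labels)
auc = roc_auc(labels, scores)
print(f"Pearson: {r:.3f}, ROC AUC: {auc:.3f}")
```

The pairwise ROC AUC is quadratic in the number of examples, which is fine for a sketch; a library implementation would be the usual choice on the full test split.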

@@ -85,8 +85,7 @@ def predict_conciseness_score(text, summary, tokenizer, model, device, zero_toke
logit_0 = first_token_logits[zero_token_id]
logit_1 = first_token_logits[one_token_id]

-
-probability_of_one = torch.sigmoid(torch.tensor(score_diff)).item()

return probability_of_one
```

# ru-summary-quality-metric

+This model is a fine-tuned version of [`ai-forever/ruT5-large`](https://huggingface.co/ai-forever/ruT5-large), trained for binary quality assessment of Russian summaries paired with their original texts.

+**Important:** This model uses a non-standard approach: it adapts a Seq2Seq model to a binary classification task by training it to predict a specific token as the target sequence. This approach directly follows the methodology used by the authors of the original SEAHORSE paper.

## Data and Training Metric

This specific model focuses on the Q6 Conciseness metric. According to the SEAHORSE paper authors, Q6 is considered one of the most high-level and challenging quality metrics.

+* **Training Data:** `ru` and `en` subsets of the training split, filtered for `conciseness` labels.
* **Evaluation Data:** only the `ru` subset of the validation and test splits.

## Evaluation Results

+|Test set|Pearson Correlation|ROC AUC|
|-|-|-|
|All|0.479|0.792|
|≥ 20 summary words |0.459|0.781|

logit_0 = first_token_logits[zero_token_id]
logit_1 = first_token_logits[one_token_id]

+probability_of_one = torch.sigmoid(logit_1 - logit_0).item()

return probability_of_one
```
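
The updated line scores a summary from only two logits of the first generated token. As a minimal sketch (plain Python, no `torch`, with hypothetical logit values), the following checks that `sigmoid(logit_1 - logit_0)` equals the probability of token `1` under a softmax restricted to the `0` and `1` tokens, which is presumably why the logit difference can be passed to the sigmoid directly. The helper names are illustrative and are not part of the README.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def two_way_softmax_p1(logit_0: float, logit_1: float) -> float:
    """Probability of class "1" under a softmax over just two logits."""
    m = max(logit_0, logit_1)  # subtract the max for numerical stability
    e0 = math.exp(logit_0 - m)
    e1 = math.exp(logit_1 - m)
    return e1 / (e0 + e1)


# Hypothetical logit values, chosen only for illustration.
logit_0, logit_1 = 1.3, 2.1

p_sigmoid = sigmoid(logit_1 - logit_0)
p_softmax = two_way_softmax_p1(logit_0, logit_1)

# Both routes give the same probability, up to floating-point rounding.
print(abs(p_sigmoid - p_softmax) < 1e-12)
```

Equal logits give a probability of 0.5, and the score moves towards 1 as `logit_1` grows relative to `logit_0`.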