Update README.md

README.md CHANGED

@@ -8,7 +8,7 @@ language:
 metrics:
 - accuracy
 base_model:
-- microsoft/deberta-large
+- microsoft/deberta-v3-large
 pipeline_tag: token-classification
 ---

@@ -22,12 +22,12 @@ pipeline_tag: token-classification

 <!-- Provide a longer summary of what this model is. -->

-This model is a grammar error correction (GEC) system fine-tuned from the `microsoft/deberta-large` model, designed to detect and correct grammatical errors in English text. The model focuses on common grammatical mistakes such as verb tense, noun inflection, adjective usage, and more. It is particularly useful for language learners or applications requiring enhanced grammatical precision.
+This model is a grammar error correction (GEC) system fine-tuned from the `microsoft/deberta-v3-large` model, designed to detect and correct grammatical errors in English text. The model focuses on common grammatical mistakes such as verb tense, noun inflection, adjective usage, and more. It is particularly useful for language learners or applications requiring enhanced grammatical precision.

 - **Model type:** Token classification with sequence-to-sequence correction
 - **Language(s) (NLP):** English
 - **License:** [More Information Needed]
-- **Finetuned from model [optional]:** `microsoft/deberta-large`
+- **Finetuned from model [optional]:** `microsoft/deberta-v3-large`


 ## Uses

@@ -79,9 +79,9 @@ from torch import nn
 from torch.nn import CrossEntropyLoss
 from transformers import AutoConfig, AutoTokenizer
 from transformers.file_utils import ModelOutput
-from transformers.models.deberta.modeling_deberta import (
-    DebertaModel,
-    DebertaPreTrainedModel,
+from transformers.models.deberta_v2.modeling_deberta_v2 import (
+    DebertaV2Model,
+    DebertaV2PreTrainedModel,
 )


@@ -103,7 +103,7 @@ class XGECToROutput(ModelOutput):
     attentions: Optional[Tuple[torch.FloatTensor]] = None


-class XGECToRDeberta(DebertaPreTrainedModel):
+class XGECToRDebertaV3(DebertaV2PreTrainedModel):
     """
     This class overrides the GECToR model to include an error detection head in addition to the token classification head.
     """

@@ -116,7 +116,7 @@ class XGECToRDeberta(DebertaPreTrainedModel):
         self.num_labels = config.num_labels
         self.unk_tag_idx = config.label2id.get("@@UNKNOWN@@", None)

-        self.deberta = DebertaModel(config)
+        self.deberta = DebertaV2Model(config)

         self.classifier = nn.Linear(config.hidden_size, config.num_labels)

@@ -249,5 +249,5 @@ The primary evaluation metric used was F0.5, measuring the model's ability to id

 ### Results

-F0.5 =
+F0.5 = 74.61
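The change above amounts to swapping the DeBERTa v1 backbone and base classes for their v2/v3 counterparts while the head stays the same. A minimal runnable sketch of the resulting model class (simplified: the error-detection head, loss logic, and real pretrained weights from the full README are omitted, and the tiny config values are illustrative only) might look like:

```python
import torch
from torch import nn
from transformers import DebertaV2Config
from transformers.models.deberta_v2.modeling_deberta_v2 import (
    DebertaV2Model,
    DebertaV2PreTrainedModel,
)


class XGECToRDebertaV3(DebertaV2PreTrainedModel):
    """Token-classification head on a DeBERTa-v2/v3 backbone (simplified sketch)."""

    def __init__(self, config):
        super().__init__(config)
        self.num_labels = config.num_labels
        # GECToR-style label vocabularies may include an @@UNKNOWN@@ tag.
        self.unk_tag_idx = config.label2id.get("@@UNKNOWN@@", None)
        self.deberta = DebertaV2Model(config)
        self.classifier = nn.Linear(config.hidden_size, config.num_labels)
        self.post_init()

    def forward(self, input_ids, attention_mask=None):
        hidden = self.deberta(
            input_ids, attention_mask=attention_mask
        ).last_hidden_state
        # One edit-tag logit vector per input token.
        return self.classifier(hidden)


# Tiny randomly initialized config so the sketch runs without downloads;
# a real setup would load the microsoft/deberta-v3-large config and weights.
config = DebertaV2Config(
    vocab_size=128,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    num_labels=5,
)
model = XGECToRDebertaV3(config)
logits = model(torch.randint(0, 128, (1, 8)))
print(logits.shape)  # torch.Size([1, 8, 5])
```

Because each token gets its own label distribution, decoding then maps the predicted edit tags back onto the input tokens, which is what makes the token-classification formulation behave like sequence-to-sequence correction.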