longhoang06
committed on
Commit
•
cd91726
1
Parent(s):
7af081a
Update README.md
README.md
CHANGED
@@ -11,11 +11,12 @@ should probably proofread and complete it, then remove this comment. -->
|
|
11 |
|
12 |
# fine-tuned-viquad-hgf
|
13 |
|
14 |
-
This model is a fine-tuned version of [bhavikardeshna/xlm-roberta-base-vietnamese](https://huggingface.co/bhavikardeshna/xlm-roberta-base-vietnamese) on the
|
15 |
|
16 |
## Model description
|
17 |
|
18 |
-
|
|
|
19 |
|
20 |
## Intended uses & limitations
|
21 |
|
@@ -23,7 +24,8 @@ More information needed
|
|
23 |
|
24 |
## Training and evaluation data
|
25 |
|
26 |
-
|
|
|
27 |
|
28 |
## Training procedure
|
29 |
|
|
|
11 |
|
12 |
# fine-tuned-viquad-hgf
|
13 |
|
14 |
+
This model is a fine-tuned version of [bhavikardeshna/xlm-roberta-base-vietnamese](https://huggingface.co/bhavikardeshna/xlm-roberta-base-vietnamese) on the [UIT-ViQuAD](https://github.com/windhashira06/Demo-QA-Extraction-system/blob/main/Dataset/UIT-ViQuAD.json) dataset.
|
15 |
|
16 |
## Model description
|
17 |
|
18 |
+
The model is described in the paper [Cascading Adaptors to Leverage English Data to Improve Performance of
|
19 |
+
Question Answering for Low-Resource Languages](https://arxiv.org/pdf/2112.09866v1.pdf).
|
20 |
|
21 |
## Intended uses & limitations
|
22 |
|
|
|
24 |
|
25 |
## Training and evaluation data
|
26 |
|
27 |
+
UIT-ViQuAD is a dataset for evaluating machine reading comprehension (MRC) models on Vietnamese, a low-resource language. It comprises over 23,000 human-generated question-answer pairs based on 5,109 passages from 174 Vietnamese Wikipedia articles. During preprocessing, however, I removed more than 3,000 questions that have no answer.
|
28 |
+
|
29 |
|
30 |
## Training procedure
|
31 |
|
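The preprocessing step mentioned under "Training and evaluation data" (dropping questions with no answers) could look roughly like the sketch below. It assumes the JSON file follows the SQuAD 2.0 layout (`data` → `paragraphs` → `qas`, with an `answers` list and an optional `is_impossible` flag); the commit does not confirm that layout, and `drop_unanswerable` and the sample record are hypothetical names and data used only for illustration.

```python
import copy


def drop_unanswerable(squad_data):
    """Return a copy of a SQuAD-style dict without unanswerable questions.

    Assumption: each question has an "answers" list and may carry an
    "is_impossible" flag, as in SQuAD 2.0. The real file may differ.
    """
    filtered = copy.deepcopy(squad_data)  # leave the caller's data untouched
    for article in filtered["data"]:
        for paragraph in article["paragraphs"]:
            paragraph["qas"] = [
                qa
                for qa in paragraph["qas"]
                if qa.get("answers") and not qa.get("is_impossible", False)
            ]
    return filtered


# Tiny hand-written sample (not taken from the dataset).
sample = {
    "data": [
        {
            "title": "example",
            "paragraphs": [
                {
                    "context": "Hà Nội là thủ đô của Việt Nam.",
                    "qas": [
                        {
                            "id": "q1",
                            "question": "Thủ đô của Việt Nam là gì?",
                            "answers": [{"text": "Hà Nội", "answer_start": 0}],
                            "is_impossible": False,
                        },
                        {
                            "id": "q2",
                            "question": "Dân số của Hà Nội là bao nhiêu?",
                            "answers": [],
                            "is_impossible": True,
                        },
                    ],
                }
            ],
        }
    ]
}

cleaned = drop_unanswerable(sample)
remaining_ids = [qa["id"] for qa in cleaned["data"][0]["paragraphs"][0]["qas"]]
```

Filtering on both the empty `answers` list and the flag keeps the sketch tolerant of files that only use one of the two conventions.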