---
license: cc-by-4.0
---

# FiD model trained on TriviaQA (TQA)

- This is a checkpoint of the FiD (Fusion-in-Decoder) model [2], based on T5-large (770M parameters) and trained on the TriviaQA dataset [1].

- Hyperparameters: 8 x 40GB A100 GPUs; batch size 8; AdamW optimizer; learning rate 3e-5; 30,000 training steps.

References:

[1] TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension. ACL 2017.

[2] Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering. EACL 2021.

## Model performance

Evaluated on the TriviaQA dataset, the model reaches an exact match (EM) score of 68.5, which is 0.8 points higher than the result reported in the original FiD paper [2].

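For readers unfamiliar with the metric, the following is a minimal sketch of how an exact match score is commonly computed in open-domain QA. It assumes the SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles, collapsing whitespace). The function names `normalize_answer`, `exact_match`, and `em_score` are illustrative, and this is not necessarily the exact script used to obtain the number above.

```python
import re
import string
from typing import Sequence


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: Sequence[str]) -> bool:
    """True if the normalized prediction equals any normalized gold answer."""
    normalized = normalize_answer(prediction)
    return any(normalized == normalize_answer(gold) for gold in gold_answers)


def em_score(predictions: Sequence[str], references: Sequence[Sequence[str]]) -> float:
    """Percentage of predictions that exactly match one of their gold answers."""
    if not predictions:
        return 0.0
    hits = sum(exact_match(p, golds) for p, golds in zip(predictions, references))
    return 100.0 * hits / len(predictions)


if __name__ == "__main__":
    # Toy data for illustration only; not taken from TriviaQA.
    preds = ["The Eiffel Tower.", "london"]
    golds = [["Eiffel Tower"], ["Berlin", "Berlin, Germany"]]
    print(em_score(preds, golds))
```

Each question may have several acceptable answers (TriviaQA provides answer aliases), so a prediction counts as correct if it matches any of them after normalization.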