Commit 3d6e239
Parent(s): 8b84700
Update README.md
README.md CHANGED

@@ -84,8 +84,8 @@ The following hyperparameters were used during training:
 | 3.69 | 0.97 | 3000 | 2.8778 | 0.1662 | 0.0237 | 0.1647 | 0.4694 |

 The mT5 model of google cannot be used for Korean although it is trained over 101 languages. Finetuning
-using very large data set
-Since GPU memories allowed for free use
+using a very large dataset such as bongsoo/news_talk_en_ko still yields garbage.
+Since the GPU memory allowed for free use in Colab is greatly limited, repeated fine-tunings on the split datasets are performed
 to obtain better results. Theoretically, this might give better results. But actual attempts fail to yield
 better results. Instead, the results become worse. One should use other
 models like the ke-t5 by KETI (Korea Electronics Technology Institute).
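For readers who want to see what the "repeated fine-tunings on the split datasets" workflow from the added lines could look like, here is a minimal sketch. It is not the script behind this commit: the shard count, hyperparameters, output paths, and the dataset column names ("en", "ko") are assumptions; only the dataset id bongsoo/news_talk_en_ko and the general idea of shard-by-shard fine-tuning under Colab memory limits come from the README text.

```python
# Hedged sketch: shard-by-shard fine-tuning of mT5 under tight GPU memory.
# Column names ("en", "ko"), paths, and hyperparameters are illustrative assumptions.
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

NUM_SHARDS = 10                       # assumed: small enough that one shard fits a free Colab GPU
checkpoint = "google/mt5-small"       # the first shard starts from the base mT5 checkpoint

tokenizer = AutoTokenizer.from_pretrained(checkpoint)

def preprocess(batch):
    # "en"/"ko" column names are an assumption about the dataset schema.
    inputs = tokenizer(batch["en"], max_length=128, truncation=True)
    labels = tokenizer(text_target=batch["ko"], max_length=128, truncation=True)
    inputs["labels"] = labels["input_ids"]
    return inputs

train = load_dataset("bongsoo/news_talk_en_ko", split="train")

for i in range(NUM_SHARDS):
    shard = train.shard(num_shards=NUM_SHARDS, index=i)
    tokenized = shard.map(preprocess, batched=True, remove_columns=shard.column_names)

    # Reload the model from the checkpoint produced by the previous shard,
    # so each run only ever trains on one shard.
    model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

    args = Seq2SeqTrainingArguments(
        output_dir=f"mt5-ko-shard-{i}",
        per_device_train_batch_size=4,    # small batch + accumulation to fit limited GPU memory
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        fp16=True,                        # assumes a CUDA GPU, as in Colab
        save_strategy="epoch",
        logging_steps=500,
    )
    trainer = Seq2SeqTrainer(
        model=model,
        args=args,
        train_dataset=tokenized,
        data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    )
    trainer.train()
    trainer.save_model(f"mt5-ko-shard-{i}")
    checkpoint = f"mt5-ko-shard-{i}"      # the next shard resumes from this model
```

The same loop could be pointed at a Korean-focused checkpoint such as KETI-AIR/ke-t5-base (model id assumed) instead of google/mt5-small, which is the switch the README ultimately recommends.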