update readme
README.md
---
license: mit
language:
- ja
pipeline_tag: text-generation
---
# Model Card for Tanrei/GPTSAN-japanese

General-purpose Switch Transformer-based Japanese language model

## Text Generation

```python
>>> from transformers import AutoModel, AutoTokenizer
>>> # Load the model and tokenizer from the Hugging Face Hub
>>> model = AutoModel.from_pretrained("Tanrei/GPTSAN-japanese")
>>> tokenizer = AutoTokenizer.from_pretrained("Tanrei/GPTSAN-japanese")
>>> # Encode the prompt ("Takeda Shingen is ...") into token IDs
>>> x_tok = tokenizer.encode("武田信玄は、")
>>> model = model.cuda()  # requires a CUDA-capable GPU
>>> res = model.generator.generate_lm(x_tok, tokenizer, connected_inputs=0)
>>> res[0]
'勝頼の父であり、天正四年(1576)に死去するまで甲府14万石の大名として甲府を治めた戦国大名ですが...'
```

## Masked Language Model

```python
>>> from transformers import AutoModel, AutoTokenizer
>>> # Load the model and tokenizer from the Hugging Face Hub
>>> model = AutoModel.from_pretrained("Tanrei/GPTSAN-japanese")
>>> tokenizer = AutoTokenizer.from_pretrained("Tanrei/GPTSAN-japanese")
>>> # Each <|inputmask|> marks a position for the model to fill in
>>> x_tok = tokenizer.encode("武田信玄は、<|inputmask|>時代ファンならぜひ押さえ<|inputmask|>きたい名将の一人。")
>>> model = model.cuda()  # requires a CUDA-capable GPU
>>> res = model.generator.predict_mlm(x_tok, tokenizer)
>>> res[0]
'武田信玄は、戦国時代ファンならぜひ押さえておきたい名将の一人。'
```

# Model Details

## Model Description

A Japanese language model based on the Switch Transformer.
It has the same structure as the model introduced as `Prefix LM` in the T5 paper, and works for both text generation and masked language modeling.
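
To illustrate the `Prefix LM` structure mentioned above, here is a minimal sketch of the attention mask such a model uses: positions inside the prefix attend to the whole prefix, while the remaining positions attend causally. The function name `prefix_lm_mask` and its parameters are hypothetical and chosen for illustration only; this is not code from the GPTSAN repository.

```python
def prefix_lm_mask(seq_len: int, prefix_len: int) -> list[list[int]]:
    """Return mask[i][j] == 1 if position i may attend to position j.

    Illustrative helper (hypothetical name), not part of GPTSAN.
    """
    if not 0 <= prefix_len <= seq_len:
        raise ValueError("prefix_len must be between 0 and seq_len")
    mask = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            # Prefix tokens are visible to every position (bidirectional
            # attention inside the prefix); all other tokens are visible
            # only to themselves and to later positions (causal attention).
            visible = j < prefix_len or j <= i
            row.append(1 if visible else 0)
        mask.append(row)
    return mask


# Example: 5 tokens, the first 2 of which form the prefix.
for row in prefix_lm_mask(seq_len=5, prefix_len=2):
    print(row)
```

With `prefix_len=0` the mask is purely causal, as in ordinary left-to-right generation; with `prefix_len` equal to `seq_len` every position sees the whole input, which is the setting a masked-language-model style prediction needs.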

- **Developed by:** Toshiyuki Sakamoto (tanreinama)
- **Model type:** Switch Transformer
- **Language(s) (NLP):** Japanese
- **License:** MIT License
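
As background on the Switch Transformer model type listed above: a Switch layer sends each token to a single expert feed-forward network, chosen by a learned router, and scales the expert's output by the router probability. The sketch below shows only that top-1 routing decision in plain Python. The function names, the number of experts, and the logits are hypothetical illustration values, not taken from this model.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route_top1(router_logits):
    """Pick one expert per token (top-1 routing).

    router_logits: one list of per-expert logits for each token.
    Returns a list of (expert_index, gate_value) pairs, where gate_value
    is the router probability used to scale the chosen expert's output.
    """
    routes = []
    for logits in router_logits:
        probs = softmax(logits)
        expert = max(range(len(probs)), key=probs.__getitem__)
        routes.append((expert, probs[expert]))
    return routes


# Hypothetical router logits for 3 tokens and 4 experts.
example_logits = [
    [2.0, 0.1, -1.0, 0.3],
    [0.0, 0.0, 3.0, 0.0],
    [-0.5, 1.5, 0.2, 0.1],
]
for token, (expert, gate) in enumerate(route_top1(example_logits)):
    print(f"token {token} -> expert {expert} (gate {gate:.2f})")
```

Because only one expert runs per token, the parameter count can grow with the number of experts while the compute per token stays roughly constant.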

## Model Sources

- **Repository:** https://github.com/tanreinama/GPTSAN