Tanrei committed
Commit 506516b · 1 Parent(s): cf0b5d7

Update README.md

Files changed (1): README.md (+41, -7)
README.md CHANGED
@@ -11,15 +11,49 @@ General-purpose Switch transformer based Japanese language model
  ## Text Generation

  ```python
- >>> from transformers import AutoModel, AutoTokenizer

- >>> model = AutoModel.from_pretrained("Tanrei/GPTSAN-japanese")
  >>> tokenizer = AutoTokenizer.from_pretrained("Tanrei/GPTSAN-japanese")
- >>> x_tok = tokenizer.encode("武田信玄は、", return_tensors="pt")
- >>> model = model.cuda()
- >>> c = model.generate(x_tok.cuda(), max_new_tokens=50, random_seed=63)
- >>> tokenizer.decode(c[0])
- '武田信玄は、戦国の頃より「智勇兼備」した英雄として織田信長に比されてきた戦国武将であり、...'
  ```

  ## Text Generation

  ```python
+ >>> from transformers import AutoModel, AutoTokenizer, trainer_utils
+ >>>
+ >>> device = "cuda"
+ >>> model = AutoModel.from_pretrained("Tanrei/GPTSAN-japanese").to(device)
+ >>> tokenizer = AutoTokenizer.from_pretrained("Tanrei/GPTSAN-japanese")
+ >>> x_token = tokenizer.encode("織田信長は、", return_tensors="pt").to(device)
+ >>> trainer_utils.set_seed(30)
+ >>> gen_token = model.generate(x_token, max_new_tokens=50)
+ >>> tokenizer.decode(gen_token[0])
+ "織田信長は、政治・軍事の中枢まで掌握した政治家であり、日本史上類を見ない驚異的な軍事侵攻を続け..."
+ ```
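
A hedged sketch, not taken from the README diff itself: the example above fixes the output by calling `trainer_utils.set_seed` before `generate`. With the same `model`, `tokenizer`, and `x_token`, the standard `transformers` generation arguments can be passed to make the sampling behaviour explicit; the parameter values below are illustrative only.

```python
>>> # Sketch only: explicit sampling controls via standard generate() arguments.
>>> # Reuses model, tokenizer, and x_token from the example above.
>>> trainer_utils.set_seed(30)
>>> gen_token = model.generate(
...     x_token,
...     max_new_tokens=50,
...     do_sample=True,    # sample rather than decode greedily
...     top_p=0.9,         # nucleus sampling cutoff (illustrative value)
...     temperature=0.8,   # sampling temperature (illustrative value)
... )
>>> tokenizer.decode(gen_token[0])
```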
+
+

+ ## Text Generation with Prefix-LM model
+
+ ```python
+ >>> from transformers import AutoModel, AutoTokenizer, trainer_utils
+ >>>
+ >>> device = "cuda"
+ >>> model = AutoModel.from_pretrained("Tanrei/GPTSAN-japanese").to(device)
+ >>> tokenizer = AutoTokenizer.from_pretrained("Tanrei/GPTSAN-japanese")
+ >>> x_token = tokenizer.encode("", prefix_text="織田信長は、", return_tensors="pt").to(device)
+ >>> trainer_utils.set_seed(30)
+ >>> gen_token = model.generate(x_token, max_new_tokens=50)
+ >>> tokenizer.decode(gen_token[0])
+ "織田信長は、政治・外交で数々の戦果を上げるが、1568年からは、いわゆる本能寺の変で細川晴元に暗殺される..."
+ ```
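
A hedged sketch, again not from the README: the Prefix-LM example sends the whole prompt in through `prefix_text` and leaves the text argument empty. Assuming the same `encode` call also accepts a non-empty text argument alongside `prefix_text`, a prompt can be split into a prefix part and a text part that the generated continuation follows directly; the particular split below is chosen only for illustration.

```python
>>> # Sketch only: split the prompt between prefix_text and the text argument.
>>> # Reuses model, tokenizer, and device from the examples above; the prompt
>>> # split itself is an illustrative assumption.
>>> x_token = tokenizer.encode(
...     "天下統一を目指し、",        # text part: "aiming to unify the realm,"
...     prefix_text="織田信長は、",  # prefix part: "Oda Nobunaga ..."
...     return_tensors="pt",
... ).to(device)
>>> trainer_utils.set_seed(30)
>>> gen_token = model.generate(x_token, max_new_tokens=50)
>>> tokenizer.decode(gen_token[0])
```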
+
+
+
+ ## Masked Language Model
+
+ ```python
+ >>> from transformers import AutoModel, AutoTokenizer, trainer_utils
+ >>>
+ >>> device = "cuda"
+ >>> model = AutoModel.from_pretrained("Tanrei/GPTSAN-japanese").to(device)
  >>> tokenizer = AutoTokenizer.from_pretrained("Tanrei/GPTSAN-japanese")
+ >>> x_token = tokenizer.encode("", prefix_text="武田信玄は、<|inputmask|>時代ファンならぜひ押さえ<|inputmask|>きたい名将の一人。", return_tensors="pt").to(device)
+ >>> out_token = model(x_token)
+ >>> tokenizer.decode(out_token[0].argmax(axis=-1)[0])
+ "武田信玄は、戦国時代ファンならぜひ押さえておきたい名将の一人。"
  ```
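
One last hedged sketch, not part of the README: decoding the full argmax above reproduces the sentence with both masks filled in, which suggests that the logits at each position predict the token for that same position. Under that assumption, the predictions at just the `<|inputmask|>` positions can be read off directly; `mask_id` and `mask_positions` are names introduced here for illustration.

```python
>>> # Sketch only: read off predictions at the <|inputmask|> positions,
>>> # assuming the per-position alignment suggested by the decoded output above.
>>> mask_id = tokenizer.convert_tokens_to_ids("<|inputmask|>")
>>> predicted_ids = out_token[0].argmax(axis=-1)[0]              # (sequence_length,)
>>> mask_positions = (x_token[0] == mask_id).nonzero(as_tuple=True)[0]
>>> tokenizer.decode(predicted_ids[mask_positions])              # the filled-in tokens only
```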