Tanrei committed · Commit 4b02c2c · 1 Parent(s): 506516b

Update README.md

Files changed (1)
  1. README.md +7 -4
README.md CHANGED
@@ -41,8 +41,7 @@ General-purpose Swich transformer based Japanese language model
  ```


-
- ## Masked Language Model
+ ## Masked Language Model And Text Generation

  ```python
  >>> from transformers import AutoModel, AutoTokenizer, trainer_utils
@@ -51,9 +50,13 @@ General-purpose Swich transformer based Japanese language model
  >>> model = AutoModel.from_pretrained("Tanrei/GPTSAN-japanese").to(device)
  >>> tokenizer = AutoTokenizer.from_pretrained("Tanrei/GPTSAN-japanese")
  >>> x_token = tokenizer.encode("", prefix_text="武田信玄は、<|inputmask|>時代ファンならぜひ押さえ<|inputmask|>きたい名将の一人。", return_tensors="pt").to(device)
- >>> out_token = model(x_token)
- >>> tokenizer.decode(out_token[0].argmax(axis=-1)[0])
+ >>> trainer_utils.set_seed(30)
+ >>> out_lm_token = model.generate(x_token, max_new_tokens=50)
+ >>> out_mlm_token = model(x_token)[0].argmax(axis=-1)
+ >>> tokenizer.decode(out_mlm_token[0])
  "武田信玄は、戦国時代ファンならぜひ押さえておきたい名将の一人。"
+ >>> tokenizer.decode(out_lm_token[0][x_token.shape[1]:])
+ "武田氏の三代に渡った武田家のひとり\n甲斐市に住む、日本史上最大の戦国大名。"
  ```


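The two decoding steps added in this commit can be illustrated without loading the model. The following is a minimal sketch, not the model's actual behavior: plain Python lists stand in for tensors, all numbers are made up, and the helper names `argmax_last_axis` and `strip_prompt` are hypothetical. It assumes, as the slice `out_lm_token[0][x_token.shape[1]:]` in the README suggests, that the generated sequence begins with the prompt tokens.

```python
# Toy illustration only: plain lists replace tensors, and every value is invented.

def argmax_last_axis(logits):
    """For each position, return the index of the highest score.

    Mirrors the idea behind `model(x_token)[0].argmax(axis=-1)`.
    """
    return [max(range(len(row)), key=row.__getitem__) for row in logits]


def strip_prompt(generated_ids, prompt_length):
    """Drop the leading prompt tokens so only newly generated ids remain.

    Mirrors the idea behind `out_lm_token[0][x_token.shape[1]:]`.
    """
    return generated_ids[prompt_length:]


# Hypothetical scores for 3 positions over a 4-token vocabulary.
toy_logits = [
    [0.1, 2.0, 0.3, 0.0],
    [1.5, 0.2, 0.1, 0.4],
    [0.0, 0.1, 0.2, 3.0],
]
mlm_ids = argmax_last_axis(toy_logits)

# Hypothetical generation output: 3 prompt ids followed by 2 new ids.
prompt_ids = [7, 8, 9]
generated_ids = prompt_ids + [4, 5]
new_ids = strip_prompt(generated_ids, len(prompt_ids))

print(mlm_ids)  # [1, 0, 3]
print(new_ids)  # [4, 5]
```

In the README snippet, the first step fills the `<|inputmask|>` positions from a single forward pass, while the second decodes only the continuation produced by `generate`.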