saattrupdan committed · Commit aaea922 · verified · 1 Parent(s): 34bb99a

Add BOS token to config


The tokenizer already has the BOS token `<|startoftext|>` in its vocabulary, but the token is not set in the configuration and is therefore never used. This causes issues with several downstream libraries that depend on a BOS token being defined. This PR sets `bos_token` and `bos_token_id` in `config.json`, and adds the matching `eos_token` string next to the existing `eos_token_id`.
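As a rough illustration of the kind of check a downstream library might run, the sketch below verifies that the BOS/EOS strings and IDs in a config agree with a tokenizer vocabulary. The helper `check_special_tokens` and the two-entry `vocab` excerpt are hypothetical; the IDs are assumed from the values in this PR (10 and 11), not read from the actual tokenizer files.

```python
def check_special_tokens(config: dict, vocab: dict) -> list:
    """Return a list of problems found with the BOS/EOS entries in `config`."""
    problems = []
    for name in ("bos", "eos"):
        token = config.get(f"{name}_token")
        token_id = config.get(f"{name}_token_id")
        if token is None or token_id is None:
            # Either the string or the ID is missing from the config.
            problems.append(f"{name}_token or {name}_token_id is not set")
        elif vocab.get(token) != token_id:
            # Both are set, but the vocabulary disagrees with the config.
            problems.append(f"{name}_token {token!r} does not map to ID {token_id}")
    return problems


# Hypothetical vocabulary excerpt (assumption: IDs as used in this PR).
vocab = {"<|startoftext|>": 10, "<|endoftext|>": 11}

# Relevant config entries before and after this change.
before = {"eos_token_id": 11}
after = {
    "bos_token": "<|startoftext|>",
    "bos_token_id": 10,
    "eos_token": "<|endoftext|>",
    "eos_token_id": 11,
}

print(check_special_tokens(before, vocab))  # two problems reported
print(check_special_tokens(after, vocab))   # no problems reported
```

With the real model, the same comparison could be made against the tokenizer's actual vocabulary instead of the hand-written excerpt.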

Files changed (1)
  1. config.json +3 -0
config.json CHANGED
@@ -4,6 +4,9 @@
   ],
   "attention_bias": false,
   "attention_dropout": 0.0,
+  "bos_token": "<|startoftext|>",
+  "bos_token_id": 10,
+  "eos_token": "<|endoftext|>",
   "eos_token_id": 11,
   "head_dim": 256,
   "hidden_act": "silu",
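
For anyone applying the same change to a local copy, here is a minimal sketch using only the standard library. It works on an excerpt containing just the keys visible in this hunk, not the full `config.json`, and it assumes the file keeps its keys in alphabetical order (which matches the visible lines but is not confirmed for the whole file).

```python
import json

# Excerpt of the config with only the keys visible in the hunk above.
original = """{
  "attention_bias": false,
  "attention_dropout": 0.0,
  "eos_token_id": 11,
  "head_dim": 256,
  "hidden_act": "silu"
}"""

config = json.loads(original)

# The three entries added by this commit.
config.update(
    {
        "bos_token": "<|startoftext|>",
        "bos_token_id": 10,
        "eos_token": "<|endoftext|>",
    }
)

# sort_keys=True is an assumption about the file's key ordering.
patched = json.dumps(config, indent=2, sort_keys=True)
print(patched)
```

Serializing with `sort_keys=True` places the new entries between `attention_dropout` and `eos_token_id`, the same position as in the diff.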