Update README.md
README.md
CHANGED
@@ -4,6 +4,24 @@ license: apache-2.0
 This is a merge model using the Tie merge method.
 Created using openchat 3.5 and una-cybertron-7b-v2-bf16.
 
+Instruction template:
+```python
+import transformers
+tokenizer = transformers.AutoTokenizer.from_pretrained("openchat/openchat_3.5")
+
+# Single-turn
+tokens = tokenizer("GPT4 Correct User: Hello<|end_of_turn|>GPT4 Correct Assistant:").input_ids
+assert tokens == [1, 420, 6316, 28781, 3198, 3123, 1247, 28747, 22557, 32000, 420, 6316, 28781, 3198, 3123, 21631, 28747]
+
+# Multi-turn
+tokens = tokenizer("GPT4 Correct User: Hello<|end_of_turn|>GPT4 Correct Assistant: Hi<|end_of_turn|>GPT4 Correct User: How are you today?<|end_of_turn|>GPT4 Correct Assistant:").input_ids
+assert tokens == [1, 420, 6316, 28781, 3198, 3123, 1247, 28747, 22557, 32000, 420, 6316, 28781, 3198, 3123, 21631, 28747, 15359, 32000, 420, 6316, 28781, 3198, 3123, 1247, 28747, 1602, 460, 368, 3154, 28804, 32000, 420, 6316, 28781, 3198, 3123, 21631, 28747]
+
+# Coding Mode
+tokens = tokenizer("Code User: Implement quicksort using C++<|end_of_turn|>Code Assistant:").input_ids
+assert tokens == [1, 7596, 1247, 28747, 26256, 2936, 7653, 1413, 334, 1680, 32000, 7596, 21631, 28747]
+```
+
 
 This model is exceptionally good at labeling data, bringing the labeling cost down to the server cost. Hurray! Here is an example:
 
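A note on the instruction template added in this hunk: the three tokenizer checks all use the same string layout, with each turn written as `<prefix> <Role>: <text><|end_of_turn|>` and the prompt ending on an open assistant turn. The sketch below builds such strings from a list of turns. `build_prompt` and `END_OF_TURN` are hypothetical names introduced here for illustration, not part of `transformers` or the model repository, and the sketch assumes the merged model expects the same layout as the openchat 3.5 tokenizer examples in the diff.

```python
# Hypothetical helper; the names below are illustrative assumptions,
# not an API provided by transformers or by the model repository.
END_OF_TURN = "<|end_of_turn|>"


def build_prompt(messages, prefix="GPT4 Correct"):
    """Render (role, text) pairs into the template layout shown in the diff.

    `messages` is a list of (role, text) tuples where role is "user" or
    "assistant". The result ends with an open assistant turn so the model
    can continue from there. Use prefix="Code" for the coding mode.
    """
    role_names = {"user": "User", "assistant": "Assistant"}
    parts = []
    for role, text in messages:
        # Each completed turn is closed with the end-of-turn marker.
        parts.append(f"{prefix} {role_names[role]}: {text}{END_OF_TURN}")
    # Leave the final assistant turn open for generation.
    parts.append(f"{prefix} Assistant:")
    return "".join(parts)


single_turn = build_prompt([("user", "Hello")])
coding = build_prompt([("user", "Implement quicksort using C++")], prefix="Code")
```

The resulting strings can be passed to the tokenizer in place of the hand-written literals; keeping the layout in one function avoids drift between single-turn, multi-turn, and coding prompts.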