Upload README.md
README.md (CHANGED)
@@ -7,7 +7,7 @@ license_name: yi-license
 model_creator: 01-ai
 model_name: Yi 34B
 model_type: yi
-prompt_template: '{prompt}
+prompt_template: 'Human: {prompt} Assistant:
 
 '
 quantized_by: TheBloke
@@ -54,10 +54,10 @@ These files were quantised using hardware kindly provided by [Massed Compute](ht
 <!-- repositories-available end -->
 
 <!-- prompt-template start -->
-## Prompt template:
+## Prompt template: Yi
 
 ```
-{prompt}
+Human: {prompt} Assistant:
 
 ```
 
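For context on this change: the template is a plain placeholder string, and the user message is dropped into `{prompt}` by simple string substitution (the README's own Python snippets do the same thing with an f-string). A minimal sketch of that substitution; the `build_prompt` helper is our illustration, not code from the README:

```python
# Minimal sketch: filling the Yi prompt template by string substitution.
# `build_prompt` is a hypothetical helper, not part of the README's own code.
PROMPT_TEMPLATE = "Human: {prompt} Assistant:"

def build_prompt(user_message: str) -> str:
    # str.format drops the user's message into the {prompt} placeholder
    return PROMPT_TEMPLATE.format(prompt=user_message)

print(build_prompt("Tell me about AI"))
# Human: Tell me about AI Assistant:
```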
@@ -228,7 +228,7 @@ from huggingface_hub import InferenceClient
 endpoint_url = "https://your-endpoint-url-here"
 
 prompt = "Tell me about AI"
-prompt_template=f'''{prompt}
+prompt_template=f'''Human: {prompt} Assistant:
 '''
 
 client = InferenceClient(endpoint_url)
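This hunk only shows the template being updated; in the full README, `prompt_template` is then sent to a Text Generation Inference endpoint. A hedged sketch of how that call typically proceeds with `InferenceClient.text_generation` (the sampling values below are illustrative defaults, not values from this diff):

```python
# Sketch of sending the updated template to a TGI endpoint.
# Sampling parameters are illustrative, not taken from this diff.
from huggingface_hub import InferenceClient

endpoint_url = "https://your-endpoint-url-here"  # placeholder, as in the README
prompt = "Tell me about AI"
prompt_template = f'''Human: {prompt} Assistant:
'''

client = InferenceClient(endpoint_url)
response = client.text_generation(
    prompt_template,
    max_new_tokens=128,      # cap on generated tokens
    temperature=0.7,         # sampling temperature
    top_p=0.95,              # nucleus sampling
    repetition_penalty=1.1,  # discourage verbatim repetition
)
print(response)
```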
@@ -281,7 +281,7 @@ model = AutoModelForCausalLM.from_pretrained(model_name_or_path,
 tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)
 
 prompt = "Tell me about AI"
-prompt_template=f'''{prompt}
+prompt_template=f'''Human: {prompt} Assistant:
 '''
 
 print("\n\n*** Generate:")
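Likewise, this hunk stops just before the README's generation step. Continuing from the `model`, `tokenizer`, and `prompt_template` defined in the snippet above, a minimal sketch of the `model.generate` call that follows (parameter values are illustrative):

```python
# Continuation sketch: generate from the templated prompt.
# Assumes `model`, `tokenizer`, and `prompt_template` from the snippet above.
input_ids = tokenizer(prompt_template, return_tensors="pt").input_ids.to(model.device)
output = model.generate(
    inputs=input_ids,
    do_sample=True,      # sample rather than decode greedily
    temperature=0.7,
    top_p=0.95,
    max_new_tokens=128,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```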
@@ -365,13 +365,19 @@ And thank you again to a16z for their generous grant.
 
 The **Yi** series models are large language models trained from scratch by
 developers at [01.AI](https://01.ai/). The first public release contains two
-bilingual(English/Chinese) base models with the parameter sizes of 6B
-Both of them are trained
-during inference time.
+bilingual (English/Chinese) base models with the parameter sizes of 6B ([`Yi-6B`](https://huggingface.co/01-ai/Yi-6B))
+and 34B ([`Yi-34B`](https://huggingface.co/01-ai/Yi-34B)). Both of them are trained
+with 4K sequence length and can be extended to 32K during inference time.
+The [`Yi-6B-200K`](https://huggingface.co/01-ai/Yi-6B-200K)
+and [`Yi-34B-200K`](https://huggingface.co/01-ai/Yi-34B-200K) are base models with
+200K context length.
 
 ## News
 
-- 🎯 **2023/11/
+- 🎯 **2023/11/06**: The base model of [`Yi-6B-200K`](https://huggingface.co/01-ai/Yi-6B-200K)
+and [`Yi-34B-200K`](https://huggingface.co/01-ai/Yi-34B-200K) with 200K context length.
+- 🎯 **2023/11/02**: The base model of [`Yi-6B`](https://huggingface.co/01-ai/Yi-6B) and
+[`Yi-34B`](https://huggingface.co/01-ai/Yi-34B).
 
 
 ## Model Performance
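The new text states that the base models are trained at a 4K sequence length and can be extended to 32K at inference time, but the README does not show the mechanism. One common way to do this for rotary-position-embedding models in `transformers` is linear RoPE scaling; the sketch below illustrates that general technique under our own assumptions, and is not a procedure documented by 01.AI:

```python
# Assumption: Yi uses rotary position embeddings, so the 4K -> 32K extension can
# be approximated with linear RoPE scaling (factor 8 = 32K / 4K). This sketches
# a common technique; it is not a procedure documented in this README.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "01-ai/Yi-34B",
    rope_scaling={"type": "linear", "factor": 8.0},  # trained 4K -> ~32K effective
    device_map="auto",
)
```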
@@ -388,8 +394,9 @@ during inference time.
 | Aquila-34B | 67.8 | 71.4 | 63.1 | - | - | - | - | - |
 | Falcon-180B | 70.4 | 58.0 | 57.8 | 59.0 | 54.0 | 77.3 | 68.8 | 34.0 |
 | Yi-6B | 63.2 | 75.5 | 72.0 | 72.2 | 42.8 | 72.3 | 68.7 | 19.8 |
-|
-|
+| Yi-6B-200K | 64.0 | 75.3 | 73.5 | 73.9 | 42.0 | 72.0 | 69.1 | 19.0 |
+| **Yi-34B** | **76.3** | **83.7** | 81.4 | 82.8 | **54.3** | **80.1** | 76.4 | 37.1 |
+| Yi-34B-200K | 76.1 | 83.6 | **81.9** | **83.4** | 52.7 | 79.7 | **76.6** | 36.3 |
 
 While benchmarking open-source models, we have observed a disparity between the
 results generated by our pipeline and those reported in public sources (e.g.