TheBloke committed on
Commit 3f52375
1 Parent(s): 2b42986

Upload README.md

Files changed (1)
  1. README.md +18 -11
README.md CHANGED
@@ -7,7 +7,7 @@ license_name: yi-license
  model_creator: 01-ai
  model_name: Yi 34B
  model_type: yi
- prompt_template: '{prompt}

   '
  quantized_by: TheBloke
@@ -54,10 +54,10 @@ These files were quantised using hardware kindly provided by [Massed Compute](ht
  <!-- repositories-available end -->

  <!-- prompt-template start -->
- ## Prompt template: None

  ```
- {prompt}

  ```
 
@@ -228,7 +228,7 @@ from huggingface_hub import InferenceClient
  endpoint_url = "https://your-endpoint-url-here"

  prompt = "Tell me about AI"
- prompt_template=f'''{prompt}
  '''

  client = InferenceClient(endpoint_url)
@@ -281,7 +281,7 @@ model = AutoModelForCausalLM.from_pretrained(model_name_or_path,
  tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)

  prompt = "Tell me about AI"
- prompt_template=f'''{prompt}
  '''

  print("\n\n*** Generate:")
@@ -365,13 +365,19 @@ And thank you again to a16z for their generous grant.

  The **Yi** series models are large language models trained from scratch by
  developers at [01.AI](https://01.ai/). The first public release contains two
- bilingual(English/Chinese) base models with the parameter sizes of 6B and 34B.
- Both of them are trained with 4K sequence length and can be extended to 32K
- during inference time.

  ## News

- - 🎯 **2023/11/02**: The base model of `Yi-6B` and `Yi-34B`.


  ## Model Performance
@@ -388,8 +394,9 @@ during inference time.
  | Aquila-34B | 67.8 | 71.4 | 63.1 | - | - | - | - | - |
  | Falcon-180B | 70.4 | 58.0 | 57.8 | 59.0 | 54.0 | 77.3 | 68.8 | 34.0 |
  | Yi-6B | 63.2 | 75.5 | 72.0 | 72.2 | 42.8 | 72.3 | 68.7 | 19.8 |
- | **Yi-34B** | **76.3** | **83.7** | **81.4** | **82.8** | **54.3** | **80.1** | **76.4** | 37.1 |
-

  While benchmarking open-source models, we have observed a disparity between the
  results generated by our pipeline and those reported in public sources (e.g.
 
  model_creator: 01-ai
  model_name: Yi 34B
  model_type: yi
+ prompt_template: 'Human: {prompt} Assistant:

   '
  quantized_by: TheBloke
 
  <!-- repositories-available end -->

  <!-- prompt-template start -->
+ ## Prompt template: Yi

  ```
+ Human: {prompt} Assistant:

  ```
 
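Since the template line above is easy to mistranscribe, here is a minimal sketch of how a downstream tool might fill the literal `{prompt}` placeholder with Python's `str.format` (the helper and its name are our illustration; the README does not specify how consumers perform the substitution):

```python
# The prompt_template metadata uses a literal "{prompt}" placeholder.
# str.format is one straightforward way to substitute the user message
# (an assumption about the consumer, not a documented contract).
template = "Human: {prompt} Assistant:"

def fill_template(prompt: str) -> str:
    """Substitute the user message into the Yi prompt template."""
    return template.format(prompt=prompt)

print(fill_template("Tell me about AI"))
# → Human: Tell me about AI Assistant:
```

The same substitution applies wherever the template string appears in the snippets below.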
  endpoint_url = "https://your-endpoint-url-here"

  prompt = "Tell me about AI"
+ prompt_template=f'''Human: {prompt} Assistant:
  '''

  client = InferenceClient(endpoint_url)
 
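Pulling the fragment above together, the endpoint call might look like the following sketch; the endpoint URL and generation parameters are placeholders, and the helper names (`build_yi_prompt`, `ask_yi`) are ours, not the README's:

```python
# Hedged sketch of the Text Generation Inference path with the updated
# Yi template. Helper names and parameter values are illustrative.
def build_yi_prompt(prompt: str) -> str:
    """Wrap a user message in the 'Human: ... Assistant:' template."""
    return f"Human: {prompt} Assistant:\n"

def ask_yi(endpoint_url: str, prompt: str, max_new_tokens: int = 128) -> str:
    # Deferred import so the prompt helper works without huggingface_hub.
    from huggingface_hub import InferenceClient

    client = InferenceClient(endpoint_url)
    # text_generation() returns the model's continuation as a string.
    return client.text_generation(build_yi_prompt(prompt),
                                  max_new_tokens=max_new_tokens)

# Example (requires a live endpoint):
# print(ask_yi("https://your-endpoint-url-here", "Tell me about AI"))
```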
  tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)

  prompt = "Tell me about AI"
+ prompt_template=f'''Human: {prompt} Assistant:
  '''

  print("\n\n*** Generate:")
 
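For local generation, `model.generate` typically echoes the prompt in the decoded output, so a small helper to separate the reply can be handy. A sketch under the assumption that the decoded text begins with the template (the helper name is ours):

```python
# Hedged sketch: base models echo the prompt, so after decoding we keep
# only the text following the final "Assistant:" marker.
def extract_reply(decoded: str) -> str:
    """Return the text after the last 'Assistant:' in the decoded output."""
    return decoded.split("Assistant:")[-1].strip()

decoded = "Human: Tell me about AI Assistant: AI is the study of..."
print(extract_reply(decoded))
# → AI is the study of...
```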

  The **Yi** series models are large language models trained from scratch by
  developers at [01.AI](https://01.ai/). The first public release contains two
+ bilingual (English/Chinese) base models with parameter sizes of 6B ([`Yi-6B`](https://huggingface.co/01-ai/Yi-6B))
+ and 34B ([`Yi-34B`](https://huggingface.co/01-ai/Yi-34B)). Both are trained
+ with a 4K sequence length, which can be extended to 32K at inference time.
+ [`Yi-6B-200K`](https://huggingface.co/01-ai/Yi-6B-200K)
+ and [`Yi-34B-200K`](https://huggingface.co/01-ai/Yi-34B-200K) are base models with
+ a 200K context length.

  ## News

+ - 🎯 **2023/11/06**: The base models [`Yi-6B-200K`](https://huggingface.co/01-ai/Yi-6B-200K)
+ and [`Yi-34B-200K`](https://huggingface.co/01-ai/Yi-34B-200K), with 200K context length.
+ - 🎯 **2023/11/02**: The base models [`Yi-6B`](https://huggingface.co/01-ai/Yi-6B) and
+ [`Yi-34B`](https://huggingface.co/01-ai/Yi-34B).


  ## Model Performance
 
  | Aquila-34B | 67.8 | 71.4 | 63.1 | - | - | - | - | - |
  | Falcon-180B | 70.4 | 58.0 | 57.8 | 59.0 | 54.0 | 77.3 | 68.8 | 34.0 |
  | Yi-6B | 63.2 | 75.5 | 72.0 | 72.2 | 42.8 | 72.3 | 68.7 | 19.8 |
+ | Yi-6B-200K | 64.0 | 75.3 | 73.5 | 73.9 | 42.0 | 72.0 | 69.1 | 19.0 |
+ | **Yi-34B** | **76.3** | **83.7** | 81.4 | 82.8 | **54.3** | **80.1** | 76.4 | 37.1 |
+ | Yi-34B-200K | 76.1 | 83.6 | **81.9** | **83.4** | 52.7 | 79.7 | **76.6** | 36.3 |

  While benchmarking open-source models, we have observed a disparity between the
  results generated by our pipeline and those reported in public sources (e.g.