Commit · b093c27
1 Parent(s): fea26e5
Update README.md
README.md
CHANGED
@@ -1,8 +1,7 @@
 ---
 license: cc-by-3.0
 datasets:
-- VMware/open-instruct
-- conceptofmind/cot_submix_original
+- VMware/open-instruct
 language:
 - en
 library_name: transformers
@@ -15,9 +14,11 @@ Instruction-tuned version of SalesForce/Xgen-7b-8k-base. The model is open for <
 <b> NOTE </b> : The model was trained using the Alpaca prompt template <br>
 <b> NOTE </b> : tiktoken library is required for the tokenizer. Set trust_remote_code=True when launching the tokenizer.<br>
 
-We expanded Open-instruct with additional commercially viable zero-shot COT datasets from Flan v2
+We expanded Open-instruct with additional commercially viable zero-shot COT datasets from Flan v2 to total of 140k instruct-prompt responses. <br>
 
 
+<b>Open-instruct <br>
+
 Open-instruct-v1
 - Mosaic/Dolly-HHRLHF + filtered OASST1 - cc by 3.0
 
@@ -38,8 +39,9 @@ The model supports up to <b>8192 tokens </b>
 
 ## License
 - <b>Commercially Viable </b>
-- The instruction datasets used for instruction tuning are open for commercial usage.
+- The instruction datasets used for instruction tuning are open for commercial usage.
 - Language Model, ([Salesforce/xgen-7b-8k-base](https://huggingface.co/Salesforce/xgen-7b-8k-base)) is under apache-2.0
+- Dataset ([VMware/open-instruct](https://huggingface.co/datasets/VMware/open-instruct)) is under cc-by-sa-3.0
 
 
 
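The two NOTE lines in the README (Alpaca prompt template, tokenizer loaded with trust_remote_code=True) can be illustrated with a minimal sketch. Several things here are assumptions rather than facts from the commit: the template text is the widely used no-input Alpaca wording, which may differ from the exact string used in training; `build_prompt` and `load_tokenizer` are hypothetical helper names; and the default model ID is the base model named in the diff, since the fine-tuned repository ID does not appear in this commit.

```python
# Assumed: the common no-input Alpaca template. The exact wording used to
# train this model is not shown in the commit.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:"
)


def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in the assumed Alpaca template."""
    return ALPACA_TEMPLATE.format(instruction=instruction.strip())


def load_tokenizer(model_id: str = "Salesforce/xgen-7b-8k-base"):
    """Load the tokenizer with remote code enabled, as the README note asks.

    Requires the `transformers` and `tiktoken` packages plus network access.
    The default model_id is the base model from the diff (an assumption);
    substitute the fine-tuned repository ID when known.
    """
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)


if __name__ == "__main__":
    print(build_prompt("Summarize the license terms of this model."))
```

Only `build_prompt` runs offline; `load_tokenizer` downloads and executes the tokenizer code shipped with the model repository, which is why trust_remote_code=True has to be set explicitly.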