---
language:
- eng
tags:
- llama-2
- sft
license:
- mit
datasets:
- LDJnr/Puffin
---

![puffin](https://i.imgur.com/R2xTHMb.png)

## **Redmond-Puffin-70B**

**The first commercially available language model released by Nous Research!**

This is a larger version of Puffin, which was originally the world's first third-party Llama-2 fine-tune. It leverages a hand-curated set of 3,000 high-quality examples, many of which take full advantage of the 4096-token context length of Llama 2. This model was fine-tuned by Nous Research, with LDJ leading the training and dataset curation, along with significant dataset-formation contributions by J-Supha.

Special thanks to Pygmalion AI for sponsoring the compute.

Special thanks to Emozilla for assisting with training experimentation and benchmarking.

## Model Training

Redmond-Puffin 70B is a new model trained for multiple epochs on a dataset of 3,000 carefully curated GPT-4 examples, most of which are long-context conversations between a real human and GPT-4.

Additional data came from carefully curated subsections of datasets such as CamelAI's Physics, Chemistry, Biology, and Math.

## Prompt Format

The recommended model usage is:

```
### human:

### response:
```

Optional recommended pre-prompt / system prompt:

```
### human: Interact in conversation to the best of your ability, please be concise, logical, intelligent and coherent.

### response: Sure! sounds good.
```
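
The template above can also be assembled programmatically. The sketch below is a minimal, hypothetical helper for doing so (the function name, argument shapes, and the blank-line turn separator are illustrative assumptions, not part of an official API); the resulting string can be passed to any Llama-2-compatible inference stack.

```python
def build_puffin_prompt(turns, system_preprompt=None):
    """Build a prompt in the '### human:' / '### response:' format above.

    turns: list of (human_message, response) pairs; pass response=None for
    the final turn so the model is left to complete the open response.
    system_preprompt: optional pre-prompt, framed as its own human/response
    exchange as shown in the example above.
    """
    parts = []
    if system_preprompt is not None:
        parts.append(f"### human: {system_preprompt}")
        parts.append("### response: Sure! sounds good.")
    for human, response in turns:
        parts.append(f"### human: {human}")
        if response is None:
            # Leave the response open for the model to complete.
            parts.append("### response:")
        else:
            parts.append(f"### response: {response}")
    return "\n\n".join(parts)
```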

## When should I use Puffin or Hermes 2?

Although full benchmarks have not yet been completed for this Puffin release, the original Puffin 13B and Hermes-2 13B both beat the previous SOTA on the GPT4All benchmarks, with Hermes-2 winning by a 0.1% margin over Puffin.

Overall, for general-purpose zero-shot and/or single-turn instructions, Hermes is likely the way to go. Puffin may be preferred for creative long-conversation interactions, such as having Puffin play a character or help brainstorm creative ideas or concepts that make contextual sense within an already deep conversation.

Thank you to reddit user WolframRavenwolf for this comprehensive analysis and comparison of Puffin and Hermes: https://www.reddit.com/r/LocalLLaMA/comments/158j9r9/nous_hermes_llama2_vs_redmond_puffin_13b/

## Example Outputs!

![puffin](https://huggingface.co/NousResearch/Redmond-Puffin-13B/resolve/main/puffin-examples/example1.png)

![puffin](https://huggingface.co/NousResearch/Redmond-Puffin-13B/resolve/main/puffin-examples/example2.png)

![puffin](https://huggingface.co/NousResearch/Redmond-Puffin-13B/resolve/main/puffin-examples/example3.png)

![puffin](https://huggingface.co/NousResearch/Redmond-Puffin-13B/resolve/main/puffin-examples/example4.png)

![puffin](https://huggingface.co/NousResearch/Redmond-Puffin-13B/resolve/main/puffin-examples/example5.png)

## Notable Features:

- The first Llama-2-based fine-tuned model released by Nous Research.

- Ability to recall information up to 2023 without internet access (ChatGPT's cutoff date is in 2021).

- Pretrained on 2 trillion tokens of text (double the amount of most open LLMs).

- Pretrained with a context length of 4096 tokens, and fine-tuned on a significant amount of multi-turn conversations reaching that full token limit.

- The first commercially available language model released by Nous Research.

## Future Plans

This is a relatively early build amongst the grand plans for the future of Puffin!

Current limitations: some token-mismatch problems have been identified; these may affect the current output quality. We plan to have this solved in Puffin V2, along with other improvements.

## How you can help!

In the near future we plan on leveraging the help of domain-specific expert volunteers to eliminate any mathematically/verifiably incorrect answers from our training curations.

If you have at least a bachelor's degree in mathematics, physics, biology or chemistry and would like to volunteer even just 30 minutes of your expertise, please contact LDJ on Discord!
+
## Benchmarks (New benchmarks coming soon, however here are the 13B benchmarks for now)!
|
| 95 |
+
|
| 96 |
+
As of Puffins release, it achieves a new SOTA for the GPT4All benchmarks! Supplanting Hermes for the #1 position!
|
| 97 |
+
(Rounded to nearest tenth)
|
| 98 |
+
|
| 99 |
+
Previous Sota: Hermes - 68.8
|
| 100 |
+
New Sota: Puffin - 69.9 (+1.1)
|
| 101 |
+
|
| 102 |
+
Puffin 13B supplants Hermes-2 for the #1 spot in Arc-E, HellaSwag and Winogrande!
|
| 103 |
+
|
| 104 |
+
Puffin also perfectly ties with Hermes in PIQA, however Hermes-2 still excels in much of Big Bench and AGIEval, so it's highly reccomended you give it a try as well!
|