Update README.md
# 🌿 Shurale7B-v1: Narrative based chit-chat model

Developed by [@BobaZooba](https://t.me/BobaZooba) | [CV](https://docs.google.com/document/d/1BhFvIHQ1mpm81P-n2A-lhNac-U2wOGc6F2uS9gKvk88/edit?usp=sharing) | [LinkedIn](https://www.linkedin.com/in/boriszubarev/) | [[email protected]](mailto:[email protected])

[<img src="https://cdn-uploads.huggingface.co/production/uploads/6074d5f1134c000d1ae10d42/JudU3rrPP5i87CfwINANO.png" alt="Powered by X—LLM" width="175" height="32"/>](https://github.com/BobaZooba/xllm)
# 🪄 About

Model based on [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)

[GitHub Repo](https://github.com/BobaZooba/shurale) | [Detailed step-by-step guide how to train this model](https://github.com/BobaZooba/shurale/blob/main/STEP-BY-STEP-GUIDE.md)

[<img src="https://cdn-uploads.huggingface.co/production/uploads/6074d5f1134c000d1ae10d42/4y7RfOdhxvh1Tim99uLkW.png" alt="Chat with Shurale" width="120" height="40"/>](https://t.me/TaleQuestBot)

| **HuggingFace Hub** | **7B** | **7B-GPTQ** |
|---------------------|--------|-------------|
> Shurale [/ʃʊrɑˈlʲe/] is a forest spirit in Bashkir and Tatar mythology.

[Do you want models as cool as this one?](https://www.linkedin.com/in/boriszubarev/)

</div>
# 🔧 How to use

Recommended generation parameters for sampling:

| Param              | Value |
|--------------------|-------|
| top_p              | 0.75  |
| typical_p          | 0.95  |
| top_k              | 50    |
| temperature        | 0.75  |
| repetition_penalty | 1.05  |

## Transformers
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("BobaZooba/Shurale7B-v1")
model = AutoModelForCausalLM.from_pretrained("BobaZooba/Shurale7B-v1")
```

2. Run generation
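The generation code itself is elided from this diff; a minimal sketch of the step, using the recommended sampling parameters from the table above (the prompt format here is a hypothetical placeholder — see the repo's step-by-step guide for the exact dialog format used in training):

```python
# Recommended sampling configuration (values from the table above)
SAMPLING_PARAMS = {
    "do_sample": True,
    "top_p": 0.75,
    "typical_p": 0.95,
    "top_k": 50,
    "temperature": 0.75,
    "repetition_penalty": 1.05,
}


def generate_reply(prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a single chat reply; imports are local so the sketch reads standalone."""
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("BobaZooba/Shurale7B-v1")
    model = AutoModelForCausalLM.from_pretrained("BobaZooba/Shurale7B-v1")

    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs, max_new_tokens=max_new_tokens, **SAMPLING_PARAMS
    )
    # Decode only the newly generated continuation, not the prompt
    return tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```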
### Docker

```bash
model=BobaZooba/Shurale7B-v1
volume=$PWD/data
version=1.1.0 # please make sure you are using latest or stable version (>= 1.1.0)

docker run --gpus all --shm-size 1g -p 8081:80 -v \
  $volume:/data ghcr.io/huggingface/text-generation-inference:$version \
  --model-id $model --max-batch-prefill-tokens 2048 --dtype bfloat16
```
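Once the container above is running, text-generation-inference serves a REST API on the mapped port (8081 here). A minimal standard-library client sketch — the `/generate` endpoint and its `parameters` fields are standard TGI; the prompt text is a hypothetical placeholder:

```python
import json
from urllib import request


def build_payload(prompt: str, max_new_tokens: int = 64) -> dict:
    """Request body for TGI's /generate endpoint, using the recommended sampling values."""
    return {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "do_sample": True,
            "top_p": 0.75,
            "typical_p": 0.95,
            "top_k": 50,
            "temperature": 0.75,
            "repetition_penalty": 1.05,
        },
    }


def query(prompt: str, url: str = "http://127.0.0.1:8081/generate") -> str:
    """POST the prompt to the running TGI server and return the generated text."""
    req = request.Request(
        url,
        data=json.dumps(build_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["generated_text"]
```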
| Field             | Value                                                                                                                      |
|-------------------|----------------------------------------------------------------------------------------------------------------------------|
| Container Image   | ghcr.io/huggingface/text-generation-inference:1.1.0                                                                          |
| Docker Command    | --model-id BobaZooba/Shurale7B-v1 --num-shard 1 --port 8081 --max-batch-prefill-tokens 2048 --dtype bfloat16 --json-output   |
| Container Disk    | 5                                                                                                                            |
| Volume Disk       | 15                                                                                                                           |
| Volume Mount Path | /data                                                                                                                        |
# 🚄 Training Process

[<img src="https://cdn-uploads.huggingface.co/production/uploads/6074d5f1134c000d1ae10d42/JudU3rrPP5i87CfwINANO.png" alt="Powered by X—LLM" width="175" height="32"/>](https://github.com/BobaZooba/xllm)

## Dataset
# 📋 Dialog examples

## Tale Quest

`Tale Quest` is my personal project, built with `xllm` and `Shurale`. It's an interactive text-based game in `Telegram` with dynamic AI characters, offering infinite scenarios.

You will embark on exciting journeys and complete fascinating quests. Chat with `George Orwell`, `Tech Entrepreneur`, `Young Wizard`, `Noir Detective`, `Femme Fatale` and many more.

Try it now: [https://t.me/talequestbot](https://t.me/PapayaAIBot?start=Z2g)

Default examples (not as interesting as in TaleQuest):

<details>
<summary>Example #1</summary>
If this model proves successful, I plan to implement an algorithm similar to DeepMind's ReST ([link](https://arxiv.org/pdf/2308.08998.pdf)). The mentioned work has great potential but has a number of shortcomings, which I've managed to address in my approach.