---
base_model:
- cstr/llama3.1-8b-spaetzle-v59
- cstr/llama3.1-8b-spaetzle-v63
- cstr/llama3.1-8b-spaetzle-v66
- cstr/llama3.1-8b-spaetzle-v73
tags:
- merge
- mergekit
license: llama3
language:
- en
- de
library_name: transformers
---

# llama3.1-8b-spaetzle-v74

llama3.1-8b-spaetzle-v74 is a merge of the following models:
* [cstr/llama3.1-8b-spaetzle-v59](https://huggingface.co/cstr/llama3.1-8b-spaetzle-v59)
* [cstr/llama3.1-8b-spaetzle-v63](https://huggingface.co/cstr/llama3.1-8b-spaetzle-v63)
* [cstr/llama3.1-8b-spaetzle-v66](https://huggingface.co/cstr/llama3.1-8b-spaetzle-v66)
* [cstr/llama3.1-8b-spaetzle-v73](https://huggingface.co/cstr/llama3.1-8b-spaetzle-v73)

EQ-Bench v2_de: 68.05 (169/171 parsed), en: 75.27. Not the best scores, but the model gives decent answers to some trick questions, and I have a soft spot for that. ;)

## 🧩 Configuration

```yaml
models:
  - model: cstr/llama3.1-8b-spaetzle-v59
    parameters:
      weight: 0.3
      density: 0.5
  - model: cstr/llama3.1-8b-spaetzle-v63
    parameters:
      weight: 0.15
      density: 0.5
  - model: cstr/llama3.1-8b-spaetzle-v66
    parameters:
      weight: 0.15
      density: 0.5
  - model: cstr/llama3.1-8b-spaetzle-v73
    parameters:
      weight: 0.4
      density: 0.5
base_model: cstr/llama3.1-8b-spaetzle-v59
merge_method: della_linear
parameters:
  int8_mask: true
  normalize: true
  epsilon: 0.1
  lambda: 1.0
  density: 0.7
dtype: bfloat16
```
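
To reproduce the merge, the config above can be saved (e.g. as `config.yaml`; file name and output path below are illustrative choices) and run through mergekit, either via the `mergekit-yaml` CLI or its Python API. A minimal sketch using mergekit's documented Python entry points:

```python
# Sketch: run the merge config above with mergekit's Python API.
# "config.yaml" and the output directory are hypothetical names.
import yaml
import torch

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("config.yaml", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    "./llama3.1-8b-spaetzle-v74",        # output directory for merged weights
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # merge on GPU if one is available
        copy_tokenizer=True,             # copy the tokenizer into the output
        lazy_unpickle=True,              # lower peak RAM while loading shards
    ),
)
```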

## 💻 Usage

```python
# pip install -qU transformers accelerate

import torch
import transformers
from transformers import AutoTokenizer

model = "cstr/llama3.1-8b-spaetzle-v74"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Format the messages with the model's chat template.
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
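
The printed `generated_text` above includes the echoed prompt. If you only want the model's reply, the `transformers` text-generation pipeline accepts `return_full_text=False` (a standard pipeline option, not specific to this model):

```python
# Return only the newly generated tokens, without echoing the prompt.
reply = pipeline(
    prompt,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95,
    return_full_text=False,
)[0]["generated_text"]
print(reply)
```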