Update README.md
README.md
This text snippet is then used for your answer.<br>
Especially to deal with the context length, and I don't mean just the theoretical number you can set.
Some models can handle 128k or 1M tokens, but even at 16k or 32k of input their answers from the same snippets are worse than those of other well-developed models.<br>
<br>
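The effective window is whatever the runtime actually allocates, not the number on the model card, so it is worth setting it explicitly. A minimal sketch, assuming an Ollama-style local API (this README does not name a runtime); the URL, model tag and prompt are placeholders:

```python
# Minimal sketch: asking a local model while explicitly setting the context
# window the runtime should actually allocate. Assumes an Ollama-style
# endpoint; URL, model tag and prompt are placeholders, not from this README.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "granite3.2:8b",  # placeholder model tag
        "prompt": "Answer using only the snippets below ...",
        "stream": False,
        # num_ctx is the window that is really allocated; an advertised
        # 128k means nothing if this stays at the small default.
        "options": {"num_ctx": 16384},
    },
    timeout=300,
)
print(resp.json()["response"])
```

If the prompt exceeds that window, most runtimes silently truncate the oldest part, which is one way long-context answers degrade.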
These usual models work well:<br>
llama3.1, llama3.2, qwen2.5, deepseek-r1-distill, gemma-3, granite, SauerkrautLM-Nemo (German) ... <br>
(llama3 or phi3.5 do not work well) <br><br>
<b>⇨</b> best models for English and German:<br>
granite-3.2-8b (2b version also) - https://huggingface.co/ibm-research/granite-3.2-8b-instruct-GGUF<br>
Chocolatine-2-14B (other versions also) - https://huggingface.co/mradermacher/Chocolatine-2-14B-Instruct-DPO-v2.0b11-GGUF<br>
QwQ-LCoT (7/14b) - https://huggingface.co/mradermacher/QwQ-LCoT-14B-Conversational-GGUF<br><br>

...
# Important -> The Systemprompt (some examples):
<li> The system prompt is weighted with a certain amount of influence alongside your question. You can easily test this by asking the same question once without a system prompt, or with a nonsensical one (a sketch of this test follows below).</li>
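A quick way to run that test, assuming an OpenAI-compatible local server (LM Studio, llama.cpp server and similar tools expose this API); URL, port and model name are placeholders:

```python
# The A/B test from the list item above: the same question, once with no
# system prompt, once with a normal one, once with a nonsensical one.
# Assumes an OpenAI-compatible local server; all names are placeholders.
import requests

QUESTION = "What do the provided snippets say about the return policy?"

for system in (None, "You are a precise assistant.", "You only speak in riddles."):
    messages = [{"role": "system", "content": system}] if system else []
    messages.append({"role": "user", "content": QUESTION})
    r = requests.post(
        "http://localhost:1234/v1/chat/completions",
        json={"model": "local-model", "messages": messages},
        timeout=300,
    )
    print(f"--- system prompt: {system!r}")
    print(r.json()["choices"][0]["message"]["content"][:200], "\n")
```

Comparing the three answers shows how much weight the model really gives the system prompt.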
...
or:<br>
"You are a warm and engaging companion who loves to talk about cooking, recipes and the joy of food.
Your aim is to share delicious recipes, cooking tips and the stories behind different cultures in a personal, welcoming and knowledgeable way."<br>
<br>
btw. <b>Jinja</b> templates are very new ... the usual templates with the usual models are fine, but merged models have a lot of optimization potential (but don't ask me, I'm not a coder)<br>
<br><br>
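For readers who haven't seen one: a chat template is a small Jinja program that turns the message list into the exact prompt string the model was trained on. A minimal sketch with a generic ChatML-style template, not the template of any model named here:

```python
# Generic ChatML-style Jinja chat template, for illustration only.
# A wrong or mismatched template silently degrades output quality, which is
# why merged models (mixed template lineages) leave room for optimization.
from jinja2 import Template

chat_template = Template(
    "{% for m in messages %}"
    "<|im_start|>{{ m['role'] }}\n{{ m['content'] }}<|im_end|>\n"
    "{% endfor %}"
    "<|im_start|>assistant\n"
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hi!"},
]
print(chat_template.render(messages=messages))
```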
...
<br>