kalle07 committed
Commit 0fb7a38 · verified · 1 Parent(s): 964ac28

Update README.md

Files changed (1)
  1. README.md +8 -12
README.md CHANGED
@@ -100,6 +100,13 @@ This text snippet is then used for your answer. <br>
  Especially to deal with the context length, and I don't mean just the theoretical number you can set.
  Some models can handle 128k or 1M tokens, but even with 16k or 32k of input the response to the same snippets is worse than with other well-developed models.<br>
  <br>
+ llama3.1, llama3.2, qwen2.5, deepseek-r1-distill, gemma-3, granite, SauerkrautLM-Nemo (german) ... <br>
+ (llama3 or phi3.5 do not work well) <br><br>
+ <b>&#x21e8;</b> best models for english and german:<br>
+ granite-3.2-8b (2b version also) - https://huggingface.co/ibm-research/granite-3.2-8b-instruct-GGUF<br>
+ Chocolatine-2-14B (other versions also) - https://huggingface.co/mradermacher/Chocolatine-2-14B-Instruct-DPO-v2.0b11-GGUF<br>
+ QwQ-LCoT- (7/14b) - https://huggingface.co/mradermacher/QwQ-LCoT-14B-Conversational-GGUF<br><br>
+
  ...
  # Important -> The System Prompt (some examples):
  <li> The system prompt carries a certain amount of influence over your question. You can easily test this by running once with no system prompt, or with a nonsensical one.</li>
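You can check the usable context for yourself. A minimal sketch, assuming llama-cpp-python and a local GGUF file; the model path, snippets, and question are placeholders. Load the model at the window you actually run, not the advertised maximum, and compare answers for the same snippets across models:

```python
# A minimal sketch, assuming llama-cpp-python and a local GGUF file.
# The file name, snippets, and question are placeholders.
from llama_cpp import Llama

llm = Llama(
    model_path="granite-3.2-8b-instruct-Q4_K_M.gguf",  # placeholder path
    n_ctx=16384,  # the window you actually run, not the advertised maximum
)

snippets = "...the retrieved text snippets..."        # placeholder RAG context
question = "...your question about the snippets..."   # placeholder

out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "Answer only from the provided snippets."},
        {"role": "user", "content": f"{snippets}\n\nQuestion: {question}"},
    ],
    max_tokens=512,
)
print(out["choices"][0]["message"]["content"])
```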
@@ -124,19 +131,8 @@ or:<br>
  "You are a warm and engaging companion who loves to talk about cooking, recipes and the joy of food.
  Your aim is to share delicious recipes, cooking tips and the stories behind different cultures in a personal, welcoming and knowledgeable way."<br>
  <br>
- ...
- <br>
- # usual models work well:<br>
- llama3.1, llama3.2, qwen2.5, deepseek-r1-distill, gemma-3, granite, SauerkrautLM-Nemo (german) ... <br>
- (llama3 or phi3.5 do not work well) <br><br>
- <b>&#x21e8;</b> best models for english and german:<br>
- granite-3.2-8b (2b version also) - https://huggingface.co/ibm-research/granite-3.2-8b-instruct-GGUF<br>
- Chocolatine-2-14B (other versions also) - https://huggingface.co/mradermacher/Chocolatine-2-14B-Instruct-DPO-v2.0b11-GGUF<br>
- QwQ-LCoT- (7/14b) - https://huggingface.co/mradermacher/QwQ-LCoT-14B-Conversational-GGUF<br><br>
-
-
  btw. <b>Jinja</b> templates are quite new ... the usual templates with usual models are fine, but merged models have a lot of optimization potential (but don't ask me, I'm not a coder)<br>
- <br>
+ <br><br>

  ...
  <br>
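The with/without system prompt test mentioned above is easy to script. A minimal sketch, again assuming llama-cpp-python; the model path and question are placeholders, and the middle prompt is the deliberately nonsensical control:

```python
# A minimal sketch of the with/without system prompt test,
# assuming llama-cpp-python. Path and question are placeholders.
from llama_cpp import Llama

llm = Llama(model_path="model.gguf", n_ctx=8192)  # placeholder path

system_prompts = [
    None,                                                  # no system prompt
    "You are a pelican who only thinks about fish.",       # nonsensical control
    "You are a precise assistant. Answer only from the given snippets.",
]
question = "What does the text say about context length?"  # placeholder

for sp in system_prompts:
    messages = [{"role": "system", "content": sp}] if sp else []
    messages.append({"role": "user", "content": question})
    out = llm.create_chat_completion(messages=messages, max_tokens=256)
    print(f"--- system prompt: {sp!r}")
    print(out["choices"][0]["message"]["content"], "\n")
```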
 
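On the Jinja note: a minimal sketch, assuming the transformers library (the model name is only an example), that prints a model's shipped Jinja chat template and the exact prompt string that template renders:

```python
# A minimal sketch, assuming the transformers library; the model name is
# only an example. Prints the raw Jinja chat template a model ships with
# and the exact prompt string that template produces.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("ibm-granite/granite-3.2-8b-instruct")
print(tok.chat_template)  # the raw Jinja template, if the model defines one

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
prompt = tok.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)  # the string the model actually sees
```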