Delta-Vector committed commit 027b543 (verified) · 1 parent: a9e0250

Update README.md

Files changed (1): README.md (+370, −20)

README.md CHANGED
@@ -1,35 +1,385 @@

The previous README content, removed by this commit:

````markdown
---
base_model:
- Delta-Vector/Francois-PE-12B
- NewEden/francois-PE-kto-r1
library_name: transformers
tags:
- mergekit
- merge
---
# kto-francois-nemo-v2

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

This model was merged using the passthrough merge method using [Delta-Vector/Francois-PE-12B](https://huggingface.co/Delta-Vector/Francois-PE-12B) + [NewEden/francois-PE-kto-r1](https://huggingface.co/NewEden/francois-PE-kto-r1) as a base.

### Models Merged

The following models were included in the merge:

### Configuration

The following YAML configuration was used to produce this model:

```yaml
base_model: Delta-Vector/Francois-PE-12B+NewEden/francois-PE-kto-r1
dtype: bfloat16
merge_method: passthrough
models:
- model: Delta-Vector/Francois-PE-12B+NewEden/francois-PE-kto-r1
```
````

The updated README content:
---
tags:
- chat
datasets:
- NewEden/OpenCAI-ShareGPT
- NewEden/vanilla-backrooms-claude-sharegpt
- anthracite-org/kalo_opus_misc_240827
- anthracite-org/kalo_misc_part2
- NewEden/RP-logs-V2-Experimental
- NewEden/BlueSky-Experimental-sharegpt
- NewEden/Misc-Mang-Sharegpt
- NewEden/Opus-accepted-hermes-rejected-shuffled
language:
- en
pipeline_tag: text-generation
base_model: Delta-Vector/Francois-PE-12B
---

A finetune on top of the original Francois-PE model that incorporates KTO to improve coherency and prose. The model aims for short and sweet prose.

# Quants

GGUF: https://huggingface.co/Delta-Vector/Francois-Huali-12B-gguf

EXL2: https://huggingface.co/Delta-Vector/Francois-Huali-12B-exl2

## Prompting

The model has been tuned with ChatML formatting. A typical input looks like this:

```py
"""<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
<|im_start|>user
Can I ask a question?<|im_end|>
<|im_start|>assistant
"""
```

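If you build prompts programmatically, the ChatML layout above can be reproduced with a small helper. This is a minimal sketch (`to_chatml` is an illustrative name, not part of any library); using the tokenizer's chat template via `transformers` would work equally well:

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render [{'role': ..., 'content': ...}] turns in the ChatML format."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = to_chatml([
    {"role": "user", "content": "Hi there!"},
    {"role": "assistant", "content": "Nice to meet you!"},
    {"role": "user", "content": "Can I ask a question?"},
])
print(prompt)
```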
## System Prompting

I would highly recommend using either Euryale's system prompt or the EVA system prompt with the model.

<details><summary>See Sao10k's Euryale System Prompt</summary>

```
Currently, your role is {{char}}, described in detail below. As {{char}}, continue the narrative exchange with {{user}}.
<Guidelines>
• Maintain the character persona but allow it to evolve with the story.
• Be creative and proactive. Drive the story forward, introducing plotlines and events when relevant.
• All types of outputs are encouraged; respond accordingly to the narrative.
• Include dialogues, actions, and thoughts in each response.
• Utilize all five senses to describe scenarios within {{char}}'s dialogue.
• Use emotional symbols such as "!" and "~" in appropriate contexts.
• Incorporate onomatopoeia when suitable.
• Allow time for {{user}} to respond with their own input, respecting their agency.
• Act as secondary characters and NPCs as needed, and remove them when appropriate.
• When prompted for an Out of Character [OOC:] reply, answer neutrally and in plaintext, not as {{char}}.
</Guidelines>

<Forbidden>
• Using excessive literary embellishments and purple prose unless dictated by {{char}}'s persona.
• Writing for, speaking, thinking, acting, or replying as {{user}} in your response.
• Repetitive and monotonous outputs.
• Positivity bias in your replies.
• Being overly extreme or NSFW when the narrative context is inappropriate.
</Forbidden>

Follow the instructions in <Guidelines></Guidelines>, avoiding the items listed in <Forbidden></Forbidden>.
```
</details><br>

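The `{{char}}` and `{{user}}` placeholders in the prompt above are frontend-style macros (as used by SillyTavern and similar UIs). If you drive the model directly, substitute them yourself before placing the prompt in a ChatML system turn. A minimal sketch, with illustrative names:

```python
# First sentence of the Euryale prompt, for brevity; use the full prompt in practice.
EURYALE_PROMPT = (
    "Currently, your role is {{char}}, described in detail below. "
    "As {{char}}, continue the narrative exchange with {{user}}."
)

def fill_macros(template, char, user):
    # Replace frontend-style {{char}}/{{user}} macros with actual names.
    return template.replace("{{char}}", char).replace("{{user}}", user)

system_turn = (
    "<|im_start|>system\n"
    + fill_macros(EURYALE_PROMPT, char="Francois", user="Traveller")
    + "<|im_end|>"
)
print(system_turn)
```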
## Axolotl config

<details><summary>See axolotl config</summary>

Axolotl version: `0.5.0`
```yaml
base_model: Delta-Vector_Francois-PE-12B

load_in_8bit: false
load_in_4bit: false
strict: false

rl: kto
kto_undesirable_weight: 1.0

#datasets:
#  - ds_type: json
#    data_files:
#      - NewEden/Ohashi-accepted-Hermes-rejected
#    split: train
#    type: chatml.argilla
datasets:
  - path: NewEden/Opus-accepted-hermes-rejected-shuffled
    split: train
    type: chatml.argilla
dataset_prepared_path: last_run_prepared
val_set_size: 0.0
output_dir: ./francois-PE-kto-r1

remove_unused_columns: false

adapter: lora
lora_model_dir:

sequence_len: 8192
pad_to_sequence_len: false

lora_r: 64
lora_alpha: 32
lora_dropout: 0.0
lora_target_linear: true
lora_fan_in_fan_out:
lora_target_modules:
- gate_proj
- down_proj
- up_proj
- q_proj
- v_proj
- k_proj
- o_proj

wandb_project: KTO-NeMo
wandb_entity:
wandb_watch:
wandb_name: Ohashi-accepted-hermes-rejected-r1
wandb_log_model:

gradient_accumulation_steps: 4
micro_batch_size: 2
num_epochs: 1
optimizer: paged_adamw_8bit
lr_scheduler: constant_with_warmup
learning_rate: 1e-6
max_grad_norm: 0.01

train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: true

gradient_checkpointing: unsloth
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 25
evals_per_epoch: 4
eval_table_size:
eval_max_new_tokens: 128
saves_per_epoch: 1
debug:
deepspeed: /workspace/axolotl/deepspeed_configs/zero3_bf16_cpuoffload_params.json
weight_decay: 0.0
fsdp:
fsdp_config:
```

</details><br>

## Credits

Thank you to [Lucy Knada](https://huggingface.co/lucyknada), [Intervitens](https://huggingface.co/intervitens), [Cgato](https://huggingface.co/cgato), [Kubernetes Bad](https://huggingface.co/kubernetes-bad) and the rest of [Anthracite](https://huggingface.co/anthracite-org).

## Training

The training was done for 1 epoch. We used 4 x [RTX 3090](https://www.nvidia.com/en-us/geforce/graphics-cards/30-series/rtx-3090-3090ti/) GPUs, graciously provided by [Intervitens](https://huggingface.co/intervitens), for the fine-tuning of the model.

[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)

## Safety

![image/png](https://cdn-uploads.huggingface.co/production/uploads/66c26b6fb01b19d8c3c2467b/bL0o_4bvbkmzAvK3W8gu2.png)
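As a back-of-the-envelope sanity check on the training setup, the effective global batch size follows from the config's `micro_batch_size` and `gradient_accumulation_steps` together with the 4-GPU rig described in the Training section (a sketch, not part of the training code):

```python
micro_batch_size = 2             # per-GPU batch, from the axolotl config
gradient_accumulation_steps = 4  # from the axolotl config
num_gpus = 4                     # 4 x RTX 3090, per the Training section

# Sequences consumed per optimizer step across all GPUs.
effective_batch_size = micro_batch_size * gradient_accumulation_steps * num_gpus
print(effective_batch_size)  # 32
```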