Trained on 100k dumped messages from the 'chan' todd proxy. I could not dedupe the dataset, but it has had a serious
effect on the llama7b I used. Calls me master a whole bunch more now.
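
For anyone who wants to dedupe a dump like this before training, here is a minimal sketch of exact-match deduplication. It is not what was done for this model (the dataset was not deduped). The `normalize` and `dedupe` helpers are hypothetical names, and it assumes the messages are already loaded as a list of strings.

```python
import hashlib
import re


def normalize(message: str) -> str:
    # Hypothetical helper: collapse whitespace runs and lowercase,
    # so copies that differ only in spacing or case hash the same.
    return re.sub(r"\s+", " ", message).strip().lower()


def dedupe(messages):
    # Hypothetical helper: keep the first occurrence of each message,
    # comparing by a hash of the normalized text.
    seen = set()
    unique = []
    for message in messages:
        digest = hashlib.sha256(normalize(message).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(message)
    return unique
```

This only catches exact repeats after normalization. Near-duplicates, such as the same message with a small edit, would need fuzzy matching, which is outside this sketch.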

Content isn't SFW, so be aware. Trained in 4-bit for 3 epochs; I think it overfit and really needed just 2.

Tested in 4-bit and FP16 on plain HF llama-7b; maybe it works on derivative models of the same base.