Missing files?
I noticed that this repo is missing several files that the official model includes, and when I try to create a quant of this model without them, it throws an error about tokenizer.model being missing.
Now that I've added the official files that were missing, I'm able to quantize it, but I'm wondering whether these official files might interfere with your abliteration.
Thanks for your feedback. Is tokenizer.model the only missing file for you? The abliteration only modifies the weights, so changing the tokenizer files shouldn't affect it.
Hi, the other missing files are: added_tokens.json, chat_template.json, generation_config.json, preprocessor_config.json, and processor_config.json.
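In case it helps anyone compare their local copy, here is a minimal sketch that lists which of the files named in this thread are absent from a model folder. The helper name `find_missing_files` and the idea of checking a local directory are my own assumptions for illustration, not part of any quantization tool.

```python
from pathlib import Path

# File names taken from this thread; adjust the list for your own model.
EXPECTED_FILES = [
    "tokenizer.model",
    "added_tokens.json",
    "chat_template.json",
    "generation_config.json",
    "preprocessor_config.json",
    "processor_config.json",
]


def find_missing_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in model_dir.

    Hypothetical helper: it only checks for existence, not file contents.
    """
    root = Path(model_dir)
    return [name for name in expected if not (root / name).is_file()]
```

Calling `find_missing_files("path/to/model")` before quantizing should show at a glance whether anything still needs to be copied over from the official repo.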
As for the abliteration and tokenizer: I'm not sure what's up, but I did make a Q8_0 GGUF, and it gave me the same scolding responses for inappropriate questions as it would with the base model. I also made sure to use the suggested model settings.
Thanks! They shouldn't be required (probably more related to the quantization code) but I'll add them for convenience. I'll re-run the model to double-check that it works. Have you tried the 4B or the 27B by any chance?
I wanted to try the 12B and went through many suggested ways to launch it in textgen-webui, but to no avail. Is there any guide on how to load it properly there?
I use the min_p preset and change top_p to 0.95 and top_k to 64.
For model loading I use 'load-in-4bit'.
The model loads, runs, and answers; however, it seems to have a problem 'stopping' (TextGen shows it generating with no end most of the time, but clicking stop works).
If anyone has a solution to the 'not stopping' bit, that would be nice; I'm wondering if it's a stop token I'm missing.
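One thing that might be worth checking, given that generation_config.json was among the files missing earlier in this thread: whether the model folder actually declares its stop tokens. The sketch below assumes the usual Hugging Face layout, where generation_config.json carries an `eos_token_id` field holding either a single id or a list of ids. The helper names `read_eos_ids` and `truncate_at_stop` are hypothetical, and this is not how textgen-webui handles stopping internally; it is only a way to inspect the config and see what a stop check would do.

```python
import json
from pathlib import Path


def read_eos_ids(model_dir):
    """Return the eos_token_id entries from generation_config.json as a list.

    Returns an empty list if the file or the field is absent, which would be
    one possible reason for generation never stopping on its own.
    """
    path = Path(model_dir) / "generation_config.json"
    if not path.is_file():
        return []
    value = json.loads(path.read_text(encoding="utf-8")).get("eos_token_id")
    if value is None:
        return []
    # The field may be a single integer or a list of integers.
    return list(value) if isinstance(value, list) else [value]


def truncate_at_stop(token_ids, stop_ids):
    """Cut a generated sequence just before the first stop token, if any."""
    stops = set(stop_ids)
    for i, tok in enumerate(token_ids):
        if tok in stops:
            return token_ids[:i]
    return token_ids
```

If `read_eos_ids` comes back empty, or is missing an id that the chat template uses to end a turn, that could explain the endless generation; adding the missing id as a custom stop token in the UI would be the thing to try.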