Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov)
[Discord](https://discord.gg/pvy7H8DZMG)
[Request more models](https://github.com/RichardErkhov/quant_request)

Mistral-Nemo-Instruct-bellman-12b - GGUF

- Model creator: https://huggingface.co/neph1/
- Original model: https://huggingface.co/neph1/Mistral-Nemo-Instruct-bellman-12b/

| Name | Quant method | Size |
| ---- | ---- | ---- |
| [Mistral-Nemo-Instruct-bellman-12b.Q2_K.gguf](https://huggingface.co/RichardErkhov/neph1_-_Mistral-Nemo-Instruct-bellman-12b-gguf/blob/main/Mistral-Nemo-Instruct-bellman-12b.Q2_K.gguf) | Q2_K | 4.46GB |
| [Mistral-Nemo-Instruct-bellman-12b.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/neph1_-_Mistral-Nemo-Instruct-bellman-12b-gguf/blob/main/Mistral-Nemo-Instruct-bellman-12b.Q3_K_S.gguf) | Q3_K_S | 5.15GB |
| [Mistral-Nemo-Instruct-bellman-12b.Q3_K.gguf](https://huggingface.co/RichardErkhov/neph1_-_Mistral-Nemo-Instruct-bellman-12b-gguf/blob/main/Mistral-Nemo-Instruct-bellman-12b.Q3_K.gguf) | Q3_K | 5.67GB |
| [Mistral-Nemo-Instruct-bellman-12b.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/neph1_-_Mistral-Nemo-Instruct-bellman-12b-gguf/blob/main/Mistral-Nemo-Instruct-bellman-12b.Q3_K_M.gguf) | Q3_K_M | 5.67GB |
| [Mistral-Nemo-Instruct-bellman-12b.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/neph1_-_Mistral-Nemo-Instruct-bellman-12b-gguf/blob/main/Mistral-Nemo-Instruct-bellman-12b.Q3_K_L.gguf) | Q3_K_L | 6.11GB |
| [Mistral-Nemo-Instruct-bellman-12b.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/neph1_-_Mistral-Nemo-Instruct-bellman-12b-gguf/blob/main/Mistral-Nemo-Instruct-bellman-12b.IQ4_XS.gguf) | IQ4_XS | 6.33GB |
| [Mistral-Nemo-Instruct-bellman-12b.Q4_0.gguf](https://huggingface.co/RichardErkhov/neph1_-_Mistral-Nemo-Instruct-bellman-12b-gguf/blob/main/Mistral-Nemo-Instruct-bellman-12b.Q4_0.gguf) | Q4_0 | 6.59GB |
| [Mistral-Nemo-Instruct-bellman-12b.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/neph1_-_Mistral-Nemo-Instruct-bellman-12b-gguf/blob/main/Mistral-Nemo-Instruct-bellman-12b.IQ4_NL.gguf) | IQ4_NL | 6.65GB |
| [Mistral-Nemo-Instruct-bellman-12b.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/neph1_-_Mistral-Nemo-Instruct-bellman-12b-gguf/blob/main/Mistral-Nemo-Instruct-bellman-12b.Q4_K_S.gguf) | Q4_K_S | 6.63GB |
| [Mistral-Nemo-Instruct-bellman-12b.Q4_K.gguf](https://huggingface.co/RichardErkhov/neph1_-_Mistral-Nemo-Instruct-bellman-12b-gguf/blob/main/Mistral-Nemo-Instruct-bellman-12b.Q4_K.gguf) | Q4_K | 6.96GB |
| [Mistral-Nemo-Instruct-bellman-12b.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/neph1_-_Mistral-Nemo-Instruct-bellman-12b-gguf/blob/main/Mistral-Nemo-Instruct-bellman-12b.Q4_K_M.gguf) | Q4_K_M | 6.96GB |
| [Mistral-Nemo-Instruct-bellman-12b.Q4_1.gguf](https://huggingface.co/RichardErkhov/neph1_-_Mistral-Nemo-Instruct-bellman-12b-gguf/blob/main/Mistral-Nemo-Instruct-bellman-12b.Q4_1.gguf) | Q4_1 | 7.26GB |
| [Mistral-Nemo-Instruct-bellman-12b.Q5_0.gguf](https://huggingface.co/RichardErkhov/neph1_-_Mistral-Nemo-Instruct-bellman-12b-gguf/blob/main/Mistral-Nemo-Instruct-bellman-12b.Q5_0.gguf) | Q5_0 | 7.93GB |
| [Mistral-Nemo-Instruct-bellman-12b.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/neph1_-_Mistral-Nemo-Instruct-bellman-12b-gguf/blob/main/Mistral-Nemo-Instruct-bellman-12b.Q5_K_S.gguf) | Q5_K_S | 7.93GB |
| [Mistral-Nemo-Instruct-bellman-12b.Q5_K.gguf](https://huggingface.co/RichardErkhov/neph1_-_Mistral-Nemo-Instruct-bellman-12b-gguf/blob/main/Mistral-Nemo-Instruct-bellman-12b.Q5_K.gguf) | Q5_K | 8.13GB |
| [Mistral-Nemo-Instruct-bellman-12b.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/neph1_-_Mistral-Nemo-Instruct-bellman-12b-gguf/blob/main/Mistral-Nemo-Instruct-bellman-12b.Q5_K_M.gguf) | Q5_K_M | 8.13GB |
| [Mistral-Nemo-Instruct-bellman-12b.Q5_1.gguf](https://huggingface.co/RichardErkhov/neph1_-_Mistral-Nemo-Instruct-bellman-12b-gguf/blob/main/Mistral-Nemo-Instruct-bellman-12b.Q5_1.gguf) | Q5_1 | 8.61GB |
| [Mistral-Nemo-Instruct-bellman-12b.Q6_K.gguf](https://huggingface.co/RichardErkhov/neph1_-_Mistral-Nemo-Instruct-bellman-12b-gguf/blob/main/Mistral-Nemo-Instruct-bellman-12b.Q6_K.gguf) | Q6_K | 9.37GB |
| [Mistral-Nemo-Instruct-bellman-12b.Q8_0.gguf](https://huggingface.co/RichardErkhov/neph1_-_Mistral-Nemo-Instruct-bellman-12b-gguf/blob/main/Mistral-Nemo-Instruct-bellman-12b.Q8_0.gguf) | Q8_0 | 12.13GB |

Original model description:

---
language:
- sv
license: apache-2.0
library_name: transformers
tags:
- unsloth
datasets:
- neph1/bellman-7b-finetune
- neph1/codefeedback-swedish
base_model:
- mistralai/Mistral-Nemo-Instruct-2407
---

# Model Card for Bellman

This version of Bellman is finetuned from Mistral-Nemo-Instruct-2407. It is a rank-128 QLoRA trained for about 1 epoch. It is finetuned for prompt-based question answering, using a dataset created from Swedish Wikipedia with many Sweden-centric questions. New since previous versions are questions from a translated code-feedback dataset, as well as a number of stories.

Consider this a work in progress while I adjust the training for this new model size. I will provide a few updates to the model.

For GGUFs, please look to:
https://huggingface.co/mradermacher/Mistral-Nemo-Instruct-bellman-12b-GGUF and
https://huggingface.co/mradermacher/Mistral-Nemo-Instruct-bellman-12b-i1-GGUF

![image/png](https://cdn-uploads.huggingface.co/production/uploads/653cd3049107029eb004f968/IDGX3d9lGe6yx-yHjsrav.png)

[![ko-fi](https://ko-fi.com/img/githubbutton_sm.svg)](https://ko-fi.com/T6T3S8VXY)

## Model Details

Rank: 128

Trained with Unsloth on a 3090.

Differences from the base model: the base model is already quite good at Swedish, but my 'vibe check' says this finetune is slightly improved; there are fewer odd wordings. Bellman is trained on fairly short answers and tends to be less verbose.

### Training Parameters

- per_device_train_batch_size = 2
- gradient_accumulation_steps = 64
- num_train_epochs = 3
- warmup_steps = 5
- learning_rate = 1e-4
- logging_steps = 15
- optim = "adamw_8bit"
- weight_decay = 0.01
- lr_scheduler_type = "linear"
- seed = 3407
- per_device_eval_batch_size = 2
- evaluation_strategy = "steps"
- eval_accumulation_steps = 64
- eval_steps = 15
- eval_delay = 0
- save_strategy = "steps"
- save_steps = 50

With per_device_train_batch_size = 2 and gradient_accumulation_steps = 64, the effective batch size is 2 × 64 = 128 sequences per optimizer step.

### Model Description

- **Developed by:** Me
- **Funded by:** Me
- **Model type:** Instruct
- **Language(s) (NLP):** Swedish
- **License:** Apache 2.0
- **Finetuned from model:** Mistral-Nemo-Instruct-2407

## Model Card Contact

rickard@mindemia.com
### Model Description - **Developed by:** Me - **Funded by:** Me - **Model type:** Instruct - **Language(s) (NLP):** Swedish - **License:** Apache 2 License - **Finetuned from model:** Mistral-Nemo-Instruct-2407 ## Model Card Contact rickard@mindemia.com