Nondzu/PLLuM-8x7B-chat-GGUF

GGUF

Polish

conversational


Nondzu committed on Feb 24

Commit 54ad099 (verified) · 1 parent: fa44227

Update README.md


README.md +61 -3


@@ -1,3 +1,61 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+language:
+- pl
+base_model:
+- CYFRAGOVPL/PLLuM-8x7B-chat
+---
+# PLLuM-8x7B-chat GGUF Quantizations by Nondzu
+DISCLAIMER: This is a state-of-the-art quantized model. I am not the author of the original model; I only host the quantized files and take no responsibility for the models.
+This repository contains GGUF quantized versions of the [PLLuM-8x7B-chat](https://huggingface.co/CYFRAGOVPL/PLLuM-8x7B-chat) model. All quantizations were performed using [llama.cpp](https://github.com/ggerganov/llama.cpp) (release [b4765](https://github.com/ggml-org/llama.cpp/releases/tag/b4765)). These quantized models can be run in [LM Studio](https://lmstudio.ai/) or any other llama.cpp-based project.
+## Prompt Format
+Use the following prompt structure:
+```
+???
+```
+## Available Files
+Below is a list of available quantized model files along with their quantization type, file size, and a short description.
+| Filename                                                                              | Quant Type | File Size | Description                                                                                   |
+| ------------------------------------------------------------------------------------- | ---------- | --------- | --------------------------------------------------------------------------------------------- |
+| [PLLuM-8x7B-chat-Q2_K.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main)    | Q2_K       | 17 GB     | Very low quality but surprisingly usable.                                                   |
+| [PLLuM-8x7B-chat-Q3_K.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main)    | Q3_K       | 21 GB     | Low quality, suitable for setups with very limited RAM.                                       |
+| [PLLuM-8x7B-chat-Q3_K_L.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main)  | Q3_K_L     | 23 GB     | Lower quality but usable; the largest of the 3-bit quants.                                    |
+| [PLLuM-8x7B-chat-Q3_K_M.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main)  | Q3_K_M     | 21 GB     | Low quality; a middle ground among the 3-bit quants.                                          |
+| [PLLuM-8x7B-chat-Q3_K_S.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main)  | Q3_K_S     | 20 GB     | Moderate quality with improved space efficiency.                                              |
+| [PLLuM-8x7B-chat-Q4_K.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main)    | Q4_K       | 27 GB     | Good quality for standard use.                                                                |
+| [PLLuM-8x7B-chat-Q4_K_M.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main)  | Q4_K_M     | 27 GB     | Default quality for most use cases – recommended.                                             |
+| [PLLuM-8x7B-chat-Q4_K_S.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main)  | Q4_K_S     | 25 GB     | Slightly lower quality with enhanced space savings – recommended when size is a priority.       |
+| [PLLuM-8x7B-chat-Q5_0.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main)    | Q5_0       | 31 GB     | High quality; legacy (non-K) quantization format.                                             |
+| [PLLuM-8x7B-chat-Q5_K.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main)    | Q5_K       | 31 GB     | Very high quality – recommended for demanding use cases.                                      |
+| [PLLuM-8x7B-chat-Q5_K_M.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main)  | Q5_K_M     | 31 GB     | High quality – recommended.                                                                   |
+| [PLLuM-8x7B-chat-Q5_K_S.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main)  | Q5_K_S     | 31 GB     | High quality, offered as an alternative with minimal quality loss.                            |
+| [PLLuM-8x7B-chat-Q6_K.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main)    | Q6_K       | 36 GB     | Very high quality with quantized embed/output weights.                                        |
+| [PLLuM-8x7B-chat-Q8_0.gguf](https://huggingface.co/Nondzu/PLLuM-8x7B-chat-GGUF/tree/main)    | Q8_0       | 47 GB     | Maximum quality quantization.                                                                 |
+## Downloading Using Hugging Face CLI
+<details>
+  <summary>Click to view download instructions</summary>
+First, ensure you have the Hugging Face CLI installed:
+```bash
+pip install -U "huggingface_hub[cli]"
+```
+Then, target a specific file to download:
+```bash
+huggingface-cli download Nondzu/PLLuM-8x7B-chat-GGUF --include "PLLuM-8x7B-chat-Q4_K_M.gguf" --local-dir ./
+```
+To keep a large file in its own folder, pass a new directory to `--local-dir` (e.g., `PLLuM-8x7B-chat-Q8_0`); otherwise, `--local-dir ./` downloads it into the current directory.
+</details>
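
The file table and the download command in this README can be combined into a small selection step. The sketch below is not part of the README: `pick_quant` and `download_command` are hypothetical helper names, and the size list is copied by hand from the Available Files table (one representative file per size). It picks the largest quant whose file size fits a budget in GB and prints the matching `huggingface-cli download` command without running it. The budget covers file size only; running the model needs additional memory for context.

```shell
# Hypothetical helpers (not from the README).
# "quant:size_gb" pairs copied from the Available Files table, smallest first.
QUANTS="Q2_K:17 Q3_K_S:20 Q3_K_M:21 Q3_K_L:23 Q4_K_S:25 Q4_K_M:27 Q5_K_M:31 Q6_K:36 Q8_0:47"

# Print the filename of the largest quant that fits the given budget (GB).
pick_quant() {
  local budget_gb="$1"
  local best=""
  local entry
  for entry in $QUANTS; do
    # ${entry#*:} is the size in GB, ${entry%%:*} is the quant type.
    if [ "${entry#*:}" -le "$budget_gb" ]; then
      best="${entry%%:*}"
    fi
  done
  if [ -z "$best" ]; then
    echo "no quant fits in ${budget_gb} GB" >&2
    return 1
  fi
  echo "PLLuM-8x7B-chat-${best}.gguf"
}

# Print (do not run) the download command for the selected file.
download_command() {
  local file
  file=$(pick_quant "$1") || return 1
  echo "huggingface-cli download Nondzu/PLLuM-8x7B-chat-GGUF --include \"${file}\" --local-dir ./"
}
```

For example, `download_command 28` prints the command for `PLLuM-8x7B-chat-Q4_K_M.gguf`, the same file used in the README's own download example.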