---
language:
- en
library_name: gguf
base_model: HelpingAI/Dhanishtha-2.0-preview
tags:
- gguf
- quantized
- llama.cpp
license: apache-2.0
---

# HelpingAI/Dhanishtha-2.0-preview - GGUF

This repository contains GGUF quantizations of [HelpingAI/Dhanishtha-2.0-preview](https://huggingface.co/HelpingAI/Dhanishtha-2.0-preview).

## About GGUF

GGUF is the model file format used by llama.cpp and other compatible inference engines. Quantized GGUF files store the model weights at reduced precision, which shrinks file size and memory use enough to run large language models on consumer hardware, at some cost in output quality at the lower bit widths.

## Files

| Filename | Quant type | File Size | Description |
| -------- | ---------- | --------- | ----------- |
| model-fp16.gguf | FP16 | Largest | Half-precision, unquantized; highest quality |
| model-q4_0.gguf | Q4_0 | Small | 4-bit quantization |
| model-q4_1.gguf | Q4_1 | Small | 4-bit quantization, higher quality than Q4_0 |
| model-q5_0.gguf | Q5_0 | Medium | 5-bit quantization |
| model-q5_1.gguf | Q5_1 | Medium | 5-bit quantization, higher quality than Q5_0 |
| model-q8_0.gguf | Q8_0 | Large | 8-bit quantization, near-fp16 quality |

## Usage

You can use these models with llama.cpp or any other GGUF-compatible inference engine. A sketch for downloading a single file appears at the end of this card.

### llama.cpp

```bash
./llama-cli -m model-q4_0.gguf -p "Your prompt here"
```

### Python (using llama-cpp-python)

```python
from llama_cpp import Llama

# Load the quantized model from a local GGUF file
llm = Llama(model_path="model-q4_0.gguf")

# Run a single completion
output = llm("Your prompt here", max_tokens=512)
print(output['choices'][0]['text'])
```

## Original Model

This is a quantized version of [HelpingAI/Dhanishtha-2.0-preview](https://huggingface.co/HelpingAI/Dhanishtha-2.0-preview). Please refer to the original model card for more information about the model's capabilities, training data, and usage guidelines.

## Conversion Details

- Converted using llama.cpp (a sketch of the workflow appears at the end of this card)
- Original model downloaded from Hugging Face
- Multiple quantization levels provided for different use cases

## License

This model inherits the license of the original model (apache-2.0). Please check the original model card for usage terms.
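## Downloading a Single File

To fetch one quantization without cloning the whole repository, you can use the `huggingface-cli` tool that ships with the `huggingface_hub` package. A minimal sketch; the repo id `HelpingAI/Dhanishtha-2.0-preview-GGUF` below is a placeholder for this repository's actual id:

```bash
# Install the Hugging Face Hub CLI (part of the huggingface_hub package)
pip install -U huggingface_hub

# Download only the Q4_0 file into the current directory.
# The repo id is a placeholder -- substitute this repository's actual id.
huggingface-cli download HelpingAI/Dhanishtha-2.0-preview-GGUF \
  model-q4_0.gguf --local-dir .
```

The downloaded file can then be passed to `llama-cli -m` or `Llama(model_path=...)` as shown in the Usage section above.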
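## Reproducing the Conversion

For reference, here is a minimal sketch of the two-step llama.cpp workflow described under Conversion Details: convert the original model to a full-precision GGUF file, then quantize it to each target type. Tool names vary between llama.cpp releases (older versions use `convert.py` and `quantize`); the paths below assume a recent checkout built with CMake, and `/path/to/Dhanishtha-2.0-preview` stands in for a local copy of the original model.

```bash
# Clone and build llama.cpp (provides the converter script and llama-quantize)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build && cmake --build build --config Release

# Step 1: convert the original Hugging Face model to an fp16 GGUF file
python convert_hf_to_gguf.py /path/to/Dhanishtha-2.0-preview \
    --outfile model-fp16.gguf --outtype f16

# Step 2: quantize the fp16 file to a target type, e.g. Q4_0
./build/bin/llama-quantize model-fp16.gguf model-q4_0.gguf Q4_0
```

Repeating step 2 with Q4_1, Q5_0, Q5_1, and Q8_0 produces the remaining files listed in the table above.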