---
language:
- en
library_name: gguf
base_model: HelpingAI/Dhanishtha-2.0-preview
tags:
- gguf
- quantized
- llama.cpp
license: apache-2.0
---

# HelpingAI/Dhanishtha-2.0-preview - GGUF

This repository contains GGUF quantizations of [HelpingAI/Dhanishtha-2.0-preview](https://huggingface.co/HelpingAI/Dhanishtha-2.0-preview).

## About GGUF

GGUF is the binary model file format used by llama.cpp. Quantized GGUF files reduce the precision of the model weights, shrinking file size and memory use enough to run large language models on consumer hardware.

## Files

| Filename | Quant type | File Size | Description |
| -------- | ---------- | --------- | ----------- |
| model-fp16.gguf | FP16 | Largest | 16-bit half precision (unquantized) |
| model-q4_0.gguf | Q4_0 | Small | 4-bit quantization |
| model-q4_1.gguf | Q4_1 | Small | 4-bit quantization (higher quality than Q4_0) |
| model-q5_0.gguf | Q5_0 | Medium | 5-bit quantization |
| model-q5_1.gguf | Q5_1 | Medium | 5-bit quantization (higher quality than Q5_0) |
| model-q8_0.gguf | Q8_0 | Large | 8-bit quantization (highest quality) |

## Usage

You can use these models with llama.cpp or any other GGUF-compatible inference engine.
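
To fetch a single quantized file, the `huggingface-cli` tool from `huggingface_hub` works well. A minimal sketch, assuming `huggingface_hub` is installed; `<this-repo-id>` is a placeholder for this repository's id, which is not stated in this card:

```bash
# Download one quantized file from this repository
# (<this-repo-id> is a placeholder; substitute the actual repo id)
huggingface-cli download <this-repo-id> model-q4_0.gguf --local-dir .
```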

### llama.cpp

```bash
./llama-cli -m model-q4_0.gguf -p "Your prompt here"
```
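
llama.cpp also ships a server binary that exposes an OpenAI-compatible HTTP API. A minimal sketch, assuming a recent llama.cpp build (the port is an arbitrary choice):

```bash
# Serve the model over an OpenAI-compatible HTTP API on port 8080
./llama-server -m model-q4_0.gguf --port 8080
```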

### Python (using llama-cpp-python)

```python
from llama_cpp import Llama

# Load the quantized model from disk
llm = Llama(model_path="model-q4_0.gguf")

# Run a plain text completion, capped at 512 new tokens
output = llm("Your prompt here", max_tokens=512)
print(output['choices'][0]['text'])
```

## Original Model

This is a quantized version of [HelpingAI/Dhanishtha-2.0-preview](https://huggingface.co/HelpingAI/Dhanishtha-2.0-preview). Please refer to the original model card for more information about the model's capabilities, training data, and usage guidelines.

## Conversion Details

- Converted using llama.cpp (a typical workflow is sketched below)
- Original model downloaded from Hugging Face
- Multiple quantization levels provided for different use cases
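
The exact commands are not recorded in this card, but a conventional llama.cpp conversion runs roughly as follows. The script name and paths below are assumptions based on current llama.cpp layouts:

```bash
# Convert the original Hugging Face checkpoint to an fp16 GGUF
# (paths are illustrative; the converter script has been renamed across llama.cpp versions)
python convert_hf_to_gguf.py ./Dhanishtha-2.0-preview --outfile model-fp16.gguf --outtype f16

# Quantize the fp16 file down to 4-bit
./llama-quantize model-fp16.gguf model-q4_0.gguf Q4_0
```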

## License

This model inherits the license from the original model (Apache 2.0, per the metadata above). Please check the original model card for the full usage terms.