---
license: mit
datasets:
- Jiayi-Pan/Countdown-Tasks-3to4
language:
- en
- es
- la
- ar
base_model:
- unsloth/phi-4-unsloth-bnb-4bit
pipeline_tag: text-generation
library_name: transformers
tags:
- unsloth
- gguf
- deepseek
- reasoning
- thinking
- coder
- math
- llama
- llama3
- Q8
---
# Llama3-ThinkQ8
A fine-tuned version of Llama 3 that shows explicit thinking using `<think>` and `<answer>` tags. This model is quantized to 8-bit (Q8) for efficient inference.
## Model Details
- **Base Model**: Llama 3
- **Quantization**: 8-bit (Q8)
- **Special Feature**: Explicit thinking process with tags
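Because the model wraps its output in `<think>` and `<answer>` tags, downstream code can separate the reasoning from the final answer. A minimal sketch using Python's `re` module (the sample response string is hypothetical, written in the format the model is prompted to follow):

```python
import re

def split_response(text: str):
    """Extract the <think> and <answer> sections from a model response."""
    think = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", text, re.DOTALL)
    return (
        think.group(1).strip() if think else None,
        answer.group(1).strip() if answer else None,
    )

# Hypothetical sample output in the expected tag format
sample = "<think> 8 * 3 = 24, then 24 - 5 = 19 </think>\n<answer> 8 * 3 - 5 = 19 </answer>"
thoughts, answer = split_response(sample)
print(answer)  # → 8 * 3 - 5 = 19
```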
## How to Use with Ollama
### 1. Install Ollama
If you haven't already installed Ollama, follow the instructions at [ollama.ai](https://ollama.ai).
### 2. Download the model file
Download the GGUF file from this repository.
### 3. Create the Ollama model
Create a file named `Modelfile` with this content:
```
FROM llama3-thinkQ8.gguf
# Model parameters
PARAMETER temperature 0.8
PARAMETER top_p 0.9
# System prompt
SYSTEM """You are a helpful assistant. Read the user's request, reason through it step by step in your mind, and respond ONLY in the following format:
<think> {your thoughts here} </think>
<answer> {your final answer here} </answer>
Use each tag exactly once and place all of your output inside them."""
```
Then run:
```bash
ollama create llama3-think -f Modelfile
```
### 4. Run the model
```bash
ollama run llama3-think
```
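Besides the interactive CLI, a running Ollama instance exposes a local REST API on port 11434. The sketch below builds a request body for the `/api/generate` endpoint using the model name created above; the actual network call is left commented out so the snippet runs without a server:

```python
import json

def build_generate_request(prompt: str, model: str = "llama3-think") -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # return one complete response instead of streamed chunks
    }

payload = build_generate_request("Explain quantum entanglement briefly.")
print(json.dumps(payload))

# To actually send it (requires a running Ollama server):
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:11434/api/generate",
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# body = json.loads(urllib.request.urlopen(req).read())
# print(body["response"])
```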
## Example Prompts
Try these examples:
```
Using each of these numbers ONLY once (5, 8, 3) and any arithmetic operations (add, subtract, multiply, divide), create an equation that equals 19.
```
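For the countdown-style prompt above, one valid solution (not necessarily the one the model will find) is 8 × 3 − 5, which can be checked directly:

```python
# One candidate solution using each of 5, 8, 3 exactly once
result = 8 * 3 - 5
print(result)  # → 19
```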
```
Explain the concept of quantum entanglement to a high school student.
```