---
base_model: theprint/Tom-Qwen-7B-Instruct
library_name: peft
pipeline_tag: text-generation
language: en
license: apache-2.0
tags:
- lora
- sft
- transformers
- trl
- unsloth
- fine-tuned
datasets:
- theprint/ReWiz
---
# Rewiz-Tom-7B
A fine-tuned 7B-parameter model specialized in reasoning (ReWiz), built on a base model that was itself fine-tuned for step-by-step instruction following and conversation (Tom).
## Model Details
This model is a fine-tuned version of theprint/Tom-Qwen-7B-Instruct using the Unsloth framework with LoRA (Low-Rank Adaptation) for efficient training.
- **Developed by:** theprint
- **Model type:** Causal Language Model (Fine-tuned with LoRA)
- **Language:** en
- **License:** apache-2.0
- **Base model:** theprint/Tom-Qwen-7B-Instruct
- **Fine-tuning method:** LoRA with rank 128
## Intended Use
Conversation, brainstorming, and general instruction following
## Training Details
### Training Data
The ReWiz dataset is a curated mix of 20,000 reasoning-focused entries.
- **Dataset:** theprint/ReWiz
- **Format:** alpaca
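For reference, Alpaca-format records pair an `instruction` (plus an optional `input`) with a target `output`. The record below is an illustrative placeholder, not actual ReWiz content:

```python
# Illustrative Alpaca-format record; the field names are standard for the
# format, but the content is a made-up placeholder rather than ReWiz data.
example = {
    "instruction": "Explain, step by step, why the sky appears blue.",
    "input": "",  # optional context; empty for instruction-only entries
    "output": "1. Sunlight contains all visible wavelengths. 2. ...",
}
```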
### Training Procedure
- **Training epochs:** 2
- **LoRA rank:** 128
- **Learning rate:** 0.0002
- **Batch size:** 4
- **Framework:** Unsloth + transformers + PEFT
- **Hardware:** NVIDIA RTX 5090
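The configuration above corresponds roughly to the following Unsloth + TRL sketch. The LoRA alpha, target modules, and prompt template were not published, so those values are illustrative, and exact `SFTTrainer` arguments vary across `trl` versions:

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Load the base model in 4-bit for memory-efficient LoRA training
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="theprint/Tom-Qwen-7B-Instruct",  # base model from the card
    max_seq_length=4096,
    load_in_4bit=True,
)

# Attach LoRA adapters with the rank stated in the card
model = FastLanguageModel.get_peft_model(
    model,
    r=128,           # LoRA rank from the card
    lora_alpha=128,  # assumption: alpha was not published
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # typical Qwen projection layers
)

# Render each Alpaca record into one training string (template is illustrative)
def to_text(ex):
    return {"text": f"### Instruction:\n{ex['instruction']}\n\n"
                    f"### Input:\n{ex['input']}\n\n### Response:\n{ex['output']}"}

dataset = load_dataset("theprint/ReWiz", split="train").map(to_text)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    args=TrainingArguments(
        per_device_train_batch_size=4,  # batch size from the card
        learning_rate=2e-4,             # learning rate from the card
        num_train_epochs=2,             # epochs from the card
        output_dir="outputs",
    ),
)
trainer.train()
```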
## Usage
```python
from unsloth import FastLanguageModel
import torch
# Load model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="theprint/Rewiz-Tom-7B",
    max_seq_length=4096,
    dtype=None,        # auto-detect dtype
    load_in_4bit=True,
)
# Enable inference mode (applies Unsloth's faster generation path)
FastLanguageModel.for_inference(model)
# Example usage; move inputs to the model's device before generating
inputs = tokenizer(["Your prompt here"], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.7, do_sample=True)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
### Alternative Usage (Standard Transformers)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
model = AutoModelForCausalLM.from_pretrained(
    "theprint/Rewiz-Tom-7B",
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("theprint/Rewiz-Tom-7B")
# Example usage
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Your question here"},
]
# apply_chat_template returns input ids; move them to the model's device
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256, temperature=0.7, do_sample=True)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(response)
```
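Since this repo is published as a PEFT/LoRA artifact, the adapter can also be loaded explicitly on top of the base model. This is a minimal sketch assuming the repo hosts adapter weights; if it contains merged full weights instead, the standard loading shown above is sufficient:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load the base model, then attach the Rewiz LoRA adapter on top of it
base = AutoModelForCausalLM.from_pretrained(
    "theprint/Tom-Qwen-7B-Instruct",  # base model from the card
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "theprint/Rewiz-Tom-7B")
tokenizer = AutoTokenizer.from_pretrained("theprint/Rewiz-Tom-7B")

# Optionally fold the adapter into the base weights for faster inference
model = model.merge_and_unload()
```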
## GGUF Quantized Versions
Quantized GGUF versions are available in the `gguf/` directory for use with llama.cpp:
- `Rewiz-Tom-7B-f16.gguf` (14.5 GB) - 16-bit float (original precision, largest file)
- `Rewiz-Tom-7B-q3_k_m.gguf` (3.6 GB) - 3-bit quantization (medium quality)
- `Rewiz-Tom-7B-q4_k_m.gguf` (4.5 GB) - 4-bit quantization (medium, recommended for most use cases)
- `Rewiz-Tom-7B-q5_k_m.gguf` (5.2 GB) - 5-bit quantization (medium, good quality)
- `Rewiz-Tom-7B-q6_k.gguf` (6.0 GB) - 6-bit quantization (high quality)
- `Rewiz-Tom-7B-q8_0.gguf` (7.7 GB) - 8-bit quantization (very high quality)
### Using with llama.cpp
```bash
# Download a quantized version (q4_k_m recommended for most use cases)
wget https://huggingface.co/theprint/Rewiz-Tom-7B/resolve/main/gguf/Rewiz-Tom-7B-q4_k_m.gguf
# Run with llama.cpp (newer builds name the binary `llama-cli` rather than `main`)
./llama.cpp/main -m Rewiz-Tom-7B-q4_k_m.gguf -p "Your prompt here" -n 256
```
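The GGUF files can also be used from Python via the llama-cpp-python bindings (`pip install llama-cpp-python`). A minimal sketch, assuming the q4_k_m file was downloaded as shown above:

```python
from llama_cpp import Llama

# Load the quantized model; n_ctx matches the 4096 context used elsewhere in this card
llm = Llama(model_path="Rewiz-Tom-7B-q4_k_m.gguf", n_ctx=4096)

out = llm("Your prompt here", max_tokens=256, temperature=0.7)
print(out["choices"][0]["text"])
```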
## Limitations
May hallucinate or provide incorrect information.
## Citation
If you use this model, please cite:
```bibtex
@misc{rewiz_tom_7b,
  title={Rewiz-Tom-7B: Fine-tuned theprint/Tom-Qwen-7B-Instruct},
  author={theprint},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/theprint/Rewiz-Tom-7B}
}
```
## Acknowledgments
- Base model: [theprint/Tom-Qwen-7B-Instruct](https://huggingface.co/theprint/Tom-Qwen-7B-Instruct)
- Training dataset: [theprint/ReWiz](https://huggingface.co/datasets/theprint/ReWiz)
- Fine-tuning framework: [Unsloth](https://github.com/unslothai/unsloth)
- Quantization: [llama.cpp](https://github.com/ggerganov/llama.cpp)