Update README.md
README.md CHANGED
@@ -6,20 +6,68 @@ library_name: transformers
pipeline_tag: text-generation
tags:
- axolotl
- reasoning
- math
- commonsense
license: apache-2.0
datasets:
- NousResearch/Hermes-3-Dataset
model-index:
- name: Qwen3-Hermes8B-v1
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag
      type: hellaswag
    metrics:
    - type: accuracy
      value: 0.823
      name: Accuracy
  - task:
      type: text-generation
      name: Mathematical Reasoning
    dataset:
      name: GSM8K
      type: gsm8k
    metrics:
    - type: accuracy
      value: 0.871
      name: Accuracy
  - task:
      type: text-generation
      name: Theory of Mind
    dataset:
      name: TheoryPlay
      type: theoryplay
    metrics:
    - type: accuracy
      value: 0.35
      name: Accuracy
---

# Qwen3-Hermes8B-v1

This is a merged LoRA model based on Qwen/Qwen3-8B, supervised fine-tuned (SFT) on the Hermes-3 dataset. The model demonstrates strong performance across reasoning, mathematical problem-solving, and commonsense understanding tasks.

## Model Details

- **Base Model**: Qwen/Qwen3-8B
- **Language**: English (en)
- **Library**: transformers
- **Training Method**: LoRA fine-tuning with Axolotl
- **Infrastructure**: 8x B200 cluster from PrimeIntellect
- **Training Framework**: DeepSpeed ZeRO-2
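
Because the published weights are the LoRA adapter already merged back into the base model, no separate adapter loading is needed at inference time. For reference only, a merge step of this kind is typically performed with the peft library; the sketch below is illustrative, and the adapter path is a placeholder rather than an artifact from this repository.

```python
# Illustrative sketch of merging a LoRA adapter into its base model with peft.
# "path/to/lora-adapter" is a placeholder, not an artifact of this repo.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-8B", torch_dtype="auto")
merged = PeftModel.from_pretrained(base, "path/to/lora-adapter").merge_and_unload()
merged.save_pretrained("Qwen3-Hermes8B-v1")

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")
tokenizer.save_pretrained("Qwen3-Hermes8B-v1")
```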

## Performance

| Benchmark | Score | Description |
|-----------|-------|-------------|
| **HellaSwag** | 82.3% | Commonsense reasoning and natural language inference |
| **GSM8K** | 87.1% | Grade school math word problems |
| **TheoryPlay** | 35% | Theory of mind and social reasoning tasks |

## Usage

@@ -35,14 +83,95 @@ model = AutoModelForCausalLM.from_pretrained(
```python
    device_map="auto"
)

# Example usage for reasoning tasks
text = "Sarah believes that her keys are in her purse, but they are actually on the kitchen table. Where will Sarah look for her keys?"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_length=200,
    temperature=0.1,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
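
If you prefer to place the tokenized inputs explicitly on the model's device before generating, the snippet above can be adjusted as follows; this is a minor, assumed variation rather than part of the card's documented workflow:

```python
# Move the tokenized prompt to the device the model was dispatched to.
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_length=200, temperature=0.1, do_sample=True)
```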

### Chat Format

This model supports the Hermes chat format:

```python
def format_chat(messages):
    """Format a list of messages into the Hermes/ChatML <|im_start|> format."""
    formatted = ""
    for message in messages:
        role = message["role"]
        content = message["content"]
        if role == "system":
            formatted += f"<|im_start|>system\n{content}<|im_end|>\n"
        elif role == "user":
            formatted += f"<|im_start|>user\n{content}<|im_end|>\n"
        elif role == "assistant":
            formatted += f"<|im_start|>assistant\n{content}<|im_end|>\n"
    # Open the assistant turn so the model continues from here.
    formatted += "<|im_start|>assistant\n"
    return formatted

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Solve this math problem: A store has 45 apples. If they sell 1/3 of them in the morning and 1/5 of the remaining apples in the afternoon, how many apples are left?"}
]

prompt = format_chat(messages)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_length=300, temperature=0.1, do_sample=True)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
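
If the tokenizer bundled with this repository ships a ChatML-style chat template (as Qwen-family tokenizers typically do), the manual formatting above can usually be replaced with `tokenizer.apply_chat_template`. Whether the bundled template matches the Hermes format exactly is an assumption here, so treat this as a sketch continuing from the snippet above:

```python
# Sketch: build the same prompt via the tokenizer's chat template
# (assumes the bundled tokenizer defines a ChatML-style template compatible
# with the <|im_start|>/<|im_end|> format used during training).
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,  # appends the opening assistant turn
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_length=300, temperature=0.1, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```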

## Training Details

- **Training Framework**: Axolotl with DeepSpeed ZeRO-2 optimization
- **Hardware**: 8x NVIDIA B200 GPUs (PrimeIntellect cluster)
- **Base Model**: Qwen/Qwen3-8B
- **Training Method**: Low-Rank Adaptation (LoRA)
- **Dataset**: NousResearch/Hermes-3-Dataset
- **Training Duration**: 6 hours
- **Learning Rate**: 0.0004
- **Batch Size**: 8
- **Sequence Length**: 4096 tokens
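
The full Axolotl configuration is not included in this card, so the LoRA rank, alpha, and target modules are unknown. Purely as an illustration of how the settings listed above map onto a LoRA setup, a comparable configuration with the peft library might look like the following sketch; the values marked as assumed are not from the actual training run:

```python
# Illustrative LoRA setup mirroring the hyperparameters listed above.
# r, lora_alpha, lora_dropout, and target_modules are assumed values; the
# actual run used Axolotl with DeepSpeed ZeRO-2, not this hand-rolled snippet.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-8B", torch_dtype="auto")
lora_config = LoraConfig(
    r=16,                 # assumed rank
    lora_alpha=32,        # assumed scaling factor
    lora_dropout=0.05,    # assumed dropout
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```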

## Evaluation Methodology

All evaluations were conducted using:
- **HellaSwag**: Standard validation set with 4-way multiple-choice accuracy
- **GSM8K**: Test set with exact-match accuracy on final numerical answers
- **TheoryPlay**: Validation set with accuracy on theory-of-mind reasoning tasks
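
For reference, exact-match scoring on GSM8K final answers is typically implemented along these lines; the answer-extraction regex and normalization below are assumptions, not the exact harness used to produce the numbers above:

```python
import re

def extract_final_number(text: str) -> str | None:
    """Return the last number mentioned in a string (assumed extraction rule)."""
    matches = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return matches[-1] if matches else None

def gsm8k_exact_match(prediction: str, reference: str) -> bool:
    """Exact match on the final answer (GSM8K references end in '#### <number>')."""
    pred = extract_final_number(prediction)
    ref = extract_final_number(reference)
    return pred is not None and pred == ref

# Example with the apples problem from the Usage section (45 -> 30 -> 24 left):
print(gsm8k_exact_match("She sells 15, then 6, so 24 apples are left.", "#### 24"))  # True
```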

## Limitations

- The model may still struggle with very complex mathematical proofs
- Performance on non-English languages may be limited
- May occasionally generate inconsistent responses in edge cases
- The training data cutoff limits knowledge of recent events

## Ethical Considerations

This model has been trained on curated datasets and should be used responsibly. Users should:
- Verify important information generated by the model
- Be aware of potential biases in the training data
- Apply appropriate content filtering in production applications

## Citation

```bibtex
@misc{qwen3-hermes8b-v1,
  title={Qwen3-Hermes8B-v1: A Fine-tuned Language Model for Reasoning Tasks},
  author={[Your Name]},
  year={2025},
  url={https://huggingface.co/justinj92/Qwen3-Hermes8B-v1}
}
```

## License

This model is released under the Apache 2.0 license.