Update README.md
README.md CHANGED
@@ -10,4 +10,10 @@ language:
 base_model:
 - deepseek-ai/DeepSeek-R1-0528-Qwen3-8B
 new_version: kxdw2580/DeepSeek-R1-0528-Qwen3-8B-catgirl-v2.5
----
+---
+
+We have released the updated v2-qwen dataset, designed to evaluate performance advantages of large-scale models.
+
+To address limitations in previous model iterations, we implemented a hybrid fine-tuning approach combining v2-common with other v2-qwen subsets. This significantly reduced redundant reasoning processes and hallucinations in routine responses, while improvements were also observed in non-reasoning modes.
+
+Additionally, during fine-tuning, LoRA + bitsandbytes 8-bit quantization was employed to accelerate training. The model's efficiency may be compromised compared to full-precision models.