Update readme
README.md
ADDED
@@ -0,0 +1,55 @@
---
license: apache-2.0
tags:
- unsloth
- trl
- sft
- medical
- reasoning
- abliterated
- baukit-abliterated
datasets:
- FreedomIntelligence/medical-o1-reasoning-SFT
language:
- en
base_model:
- suayptalha/Qwen3-0.6B-Medical-Expert
pipeline_tag: text-generation
library_name: transformers
---

# Qwen3-0.6B-Medical-Expert (Abliterated)
This project performs full fine-tuning on the **Qwen3-0.6B** language model to enhance its **medical reasoning** and **clinical understanding** capabilities. Training was conducted on the `FreedomIntelligence/medical-o1-reasoning-SFT` dataset using bfloat16 (bf16) precision for efficient optimization.
Additionally, the model has been abliterated so that it steers away from refusals and censorship.
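
For quick inference, the sketch below loads the model with Hugging Face Transformers and generates a response through the tokenizer's chat template. It is a minimal example, not an official recipe: the repo id shown is the base model listed above (substitute this repository's id for the abliterated weights), and the prompt and generation settings are illustrative assumptions.

```python
# Minimal inference sketch; repo id, prompt, and generation settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "suayptalha/Qwen3-0.6B-Medical-Expert"  # replace with this repository's Hub id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the bf16 training precision
    device_map="auto",
)

messages = [
    {"role": "user", "content": "A 45-year-old presents with sudden chest pain radiating to the left arm. What are the key differential diagnoses?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
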
## Training Procedure

A minimal code sketch of these three steps is provided after the list.

1. **Dataset Preparation**

   * The `FreedomIntelligence/medical-o1-reasoning-SFT` dataset was used.
   * Each example pairs a medically relevant instruction or question with a detailed, step-by-step clinical reasoning response.
   * Prompts were structured to encourage safe, factual, and coherent medical reasoning chains.

2. **Model Loading and Configuration**

   * The Qwen3 base model weights were loaded via the `unsloth` library in bf16 precision.
   * All model layers were updated (`full_finetuning=True`) to adapt the model effectively to medical reasoning and decision-making tasks.

3. **Supervised Fine-Tuning**

   * Fine-tuning was conducted with the Hugging Face TRL library using the supervised fine-tuning (SFT) approach.
   * The model was trained to follow clinical instructions, interpret symptoms, and generate reasoned diagnoses or treatment suggestions.
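
The sketch below illustrates steps 1-3. The exact training script and hyperparameters were not published with this card, so the dataset configuration (`"en"`), the `Question`/`Complex_CoT`/`Response` column names, and all hyperparameter values are assumptions to verify rather than the actual settings used.

```python
# Illustrative end-to-end sketch of the training procedure above.
# Dataset config/columns and all hyperparameters are assumptions.
from unsloth import FastLanguageModel  # import unsloth before transformers/trl

import torch
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Step 2: load the base model in bf16 with every layer trainable.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen3-0.6B",
    max_seq_length=2048,
    dtype=torch.bfloat16,
    full_finetuning=True,  # full fine-tuning, as stated in this card
)

# Step 1: turn each (question, reasoning chain, answer) triple into a
# single chat-formatted training string.
dataset = load_dataset("FreedomIntelligence/medical-o1-reasoning-SFT", "en", split="train")

def to_text(example):
    messages = [
        {"role": "user", "content": example["Question"]},
        {"role": "assistant", "content": example["Complex_CoT"] + "\n\n" + example["Response"]},
    ]
    return {"text": tokenizer.apply_chat_template(messages, tokenize=False)}

dataset = dataset.map(to_text)

# Step 3: supervised fine-tuning with TRL.
trainer = SFTTrainer(
    model=model,
    processing_class=tokenizer,  # "tokenizer=" on older TRL releases
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="outputs",
        dataset_text_field="text",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=2e-5,
        bf16=True,
    ),
)
trainer.train()
```
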
## Purpose and Outcome

* Fine-tuning markedly improves the model's ability to interpret medical instructions and generate step-by-step clinical reasoning.
* It produces responses that combine factual accuracy with transparent reasoning, making it useful in educational and assistive medical AI contexts.
## License

This project is licensed under the Apache License 2.0. See the [LICENSE](./LICENSE) file for details.

## Support

<a href="https://www.buymeacoffee.com/suayptalha" target="_blank"><img src="https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png" alt="Buy Me A Coffee" style="height: 60px !important;width: 217px !important;" ></a>