Ba2han
/

Qwen3-30B-A3B-Geminized-v0.2

Model card Files Files and versions Community

Ba2han commited on May 16

Commit

24edf4b

·

verified ·

1 Parent(s): dee035e

Create README.md

Files changed (1) hide show

README.md +29 -0

README.md ADDED Viewed

	@@ -0,0 +1,29 @@

+---
+license: mit
+language:
+- en
+base_model:
+- Qwen/Qwen3-30B-A3B
+---
+> [!NOTE]
+> **Use "You are an assistant with reasoning capabilities." system message to trigger gemini-style thinking.**
+# Training Dataset
+- The fine-tuning dataset consists of ~450 diverse examples, 250 of which are directly from Gemini 2.5 Pro.
+## Trained on:
+- Unsloth version of Qwen3-30B-A3B (instruct).
+- 32k seq_len with examples ranging from 1k to ~20k tokens.
+- Up to 2 turns of conversations.
+---
+- No benchmark data for now.
+**Keep in mind that it's slightly overfit since the training dataset was quite small. The model can be used to create more high quality examples for further training.**
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/6324eabf05bd8a54c6eb1650/TEBe1XQvpJA2IZ63btFWT.png)