Daemontatox committed · Commit 38938d2 · verified · 1 Parent(s): f3050c4

Update README.md

Files changed (1):
  1. README.md +52 -6
README.md CHANGED
@@ -11,12 +11,58 @@ language:
  - en
  ---

- # Uploaded model

- - **Developed by:** Daemontatox
- - **License:** apache-2.0
- - **Finetuned from model :** unsloth/qwen3-32b-bnb-4bit

- This qwen3 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
+ ![image](./image.png)
+ # Manticore-32B
+
+ - **Developed by:** Daemontatox
+ - **License:** Apache-2.0
+ - **Finetuned from:** [unsloth/qwen3-32b-unsloth](https://huggingface.co/unsloth/qwen3-32b-unsloth)
+
+ ## Model Overview
+
+ **Manticore-32B** is a fine-tuned version of Qwen3-32B trained on the high-quality **OpenThoughts2-1M** dataset. Fine-tuned with Unsloth's TRL-compatible framework and LoRA for compute efficiency, it is optimized for **advanced reasoning tasks**, including **math**, **logic puzzles**, **code generation**, and **step-by-step problem solving**.
+
+ ## Training Dataset
+
+ - **Dataset:** [OpenThoughts2-1M](https://huggingface.co/datasets/open-thoughts/OpenThoughts2-1M)
+ - **Source:** A synthetic dataset curated and expanded by the OpenThoughts team
+ - **Volume:** ~1.1M high-quality examples
+ - **Content Type:** Multi-turn reasoning, math proofs, algorithmic code generation, logical deduction, and structured conversations
+ - **Tools Used:** [Curator Viewer](https://curator.bespokelabs.ai/)
+
+ This dataset builds on OpenThoughts-114k and integrates reasoning-centric data sources such as OpenR1-Math and KodCode.
+
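+ For a quick look at the data, the dataset can be streamed with the `datasets` library. This is a small sketch; the `train` split name is an assumption about the dataset layout.
+
+ ```python
+ from itertools import islice
+ from datasets import load_dataset
+
+ # Stream a few records instead of downloading all ~1.1M examples
+ ds = load_dataset("open-thoughts/OpenThoughts2-1M", split="train", streaming=True)
+ for example in islice(ds, 2):
+     print(example.keys())
+ ```
+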
+ ## Intended Use
+
+ This model is particularly suited for:
+
+ - Chain-of-thought and step-by-step reasoning
+ - Code generation with logical structure
+ - Educational tools for math and programming
+ - AI agents requiring multi-turn problem-solving
+
+ ## Limitations
+
+ - English-only focus (does not generalize well to other languages)
+ - May hallucinate factual content despite its reasoning depth
+ - Inherits possible biases from its synthetic fine-tuning data
+
+ ## Example Usage
+
+ ```python
+ # Use a pipeline as a high-level helper
+ from transformers import pipeline
+
+ # Chat-style input: a list of role/content messages
+ messages = [
+     {"role": "user", "content": "Who are you?"},
+ ]
+ pipe = pipeline("text-generation", model="Daemontatox/Manticore-32B")
+ print(pipe(messages))
+ ```
+
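+ For more control over decoding, the model can also be loaded directly with the standard `AutoModelForCausalLM`/`AutoTokenizer` APIs. The snippet below is a minimal sketch: the dtype, device mapping, prompt, and sampling settings are illustrative assumptions, not values published with this model.
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_id = "Daemontatox/Manticore-32B"
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id,
+     torch_dtype=torch.bfloat16,   # assumed dtype; halves memory vs. fp32
+     device_map="auto",            # spread the 32B weights across available GPUs
+ )
+
+ messages = [{"role": "user", "content": "Prove that the sum of two odd numbers is even."}]
+ inputs = tokenizer.apply_chat_template(
+     messages, add_generation_prompt=True, return_tensors="pt"
+ ).to(model.device)
+
+ # Illustrative generation settings for step-by-step reasoning output
+ outputs = model.generate(inputs, max_new_tokens=1024, temperature=0.6, top_p=0.95, do_sample=True)
+ print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
+ ```
+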
+ ## Training Details
+
+ - **Framework:** TRL + LoRA with Unsloth acceleration
+ - **Epochs/Steps:** Custom fine-tuning on ~1M samples
+ - **Hardware:** Single-node A100 80GB or similar high-VRAM setup
+ - **Objective:** Enhance multi-domain reasoning under compute-efficient constraints
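+
+ The exact training recipe was not released with this card. The block below is a hedged sketch of the setup described above (Unsloth + TRL with LoRA): the base checkpoint and dataset come from this card, while the sequence length, LoRA rank/alpha, batch size, learning rate, and the dataset's column and role names are illustrative assumptions.
+
+ ```python
+ from unsloth import FastLanguageModel
+ from datasets import load_dataset
+ from trl import SFTConfig, SFTTrainer
+
+ # Load the base checkpoint in 4-bit with Unsloth for memory-efficient fine-tuning
+ model, tokenizer = FastLanguageModel.from_pretrained(
+     model_name="unsloth/qwen3-32b-unsloth",
+     max_seq_length=4096,      # assumed training context length
+     load_in_4bit=True,
+ )
+
+ # Attach LoRA adapters (rank and alpha are illustrative, not released values)
+ model = FastLanguageModel.get_peft_model(
+     model,
+     r=16,
+     lora_alpha=16,
+     target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
+                     "gate_proj", "up_proj", "down_proj"],
+ )
+
+ # Assumed dataset layout: ShareGPT-style "conversations" rendered into chat-template text
+ def to_text(example):
+     messages = [
+         {"role": "user" if turn["from"] in ("human", "user") else "assistant",
+          "content": turn["value"]}
+         for turn in example["conversations"]
+     ]
+     return {"text": tokenizer.apply_chat_template(messages, tokenize=False)}
+
+ dataset = load_dataset("open-thoughts/OpenThoughts2-1M", split="train").map(to_text)
+
+ trainer = SFTTrainer(
+     model=model,
+     tokenizer=tokenizer,
+     train_dataset=dataset,
+     args=SFTConfig(
+         dataset_text_field="text",
+         per_device_train_batch_size=1,
+         gradient_accumulation_steps=8,
+         learning_rate=2e-4,
+         num_train_epochs=1,
+         bf16=True,
+         output_dir="manticore-32b-lora",
+     ),
+ )
+ trainer.train()
+ ```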