Update README.md
README.md (CHANGED)
@@ -18,10 +18,10 @@ datasets:
 
 # LogicFlow-Llama-3B
 
-🚀 **Introducing LogicFlow-Llama-3B: Exploring Open Access to Chain-of-Thought Reasoning**
-
 
 
+🚀 **Introducing LogicFlow-Llama-3B: Exploring Open Access to Chain-of-Thought Reasoning**
+
 Ever wished your AI could not just *tell* you the answer, but *show* you its thinking? **LogicFlow-Llama-3B** represents an exciting attempt to instill robust Chain-of-Thought (CoT) capabilities into models like `meta-llama/Llama-3.2-3B-Instruct`, which, in its base form, does not possess strong inherent CoT reasoning. This isn't just another fine-tune; it's a meticulously crafted model designed to explore the potential of CoT on accessible hardware.
 
 Leveraging the insightful `open-thoughts/OpenThoughts-114k` dataset and the versatile [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) training library, LogicFlow-Llama-3B has been trained to dissect intricate problems and articulate its reasoning process step-by-step. Remarkably, this entire fine-tuning process was accomplished **on a single GPU**, demonstrating a pathway to more accessible CoT model development. Get ready to explore the frontiers of logical AI and unlock a new era of AI-powered deep thinking, even with limited resources!
@@ -34,7 +34,7 @@ Leveraging the insightful `open-thoughts/OpenThoughts-114k` dataset and the vers
 - **Fine-tuning Method:** LoRA (Low-Rank Adaptation)
 - **Fine-tuning Library:** LLaMA-Factory
 - **Dataset:** `open-thoughts/OpenThoughts-114k` (for Chain-of-Thought enhancement)
-- **Training Hardware:** Single GPU
+- **Training Hardware:** Single GPU (NVIDIA A6000)
 - **LoRA Rank:** 8
 - **LoRA Alpha:** 16
 - **LoRA Dropout:** 0
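The LoRA hyperparameters in the hunk above (rank 8, alpha 16, dropout 0) map directly onto a standard PEFT configuration. Below is a minimal sketch of an equivalent setup outside LLaMA-Factory; the target modules and task type are assumptions for illustration, not values recorded in this diff.

```python
# Minimal sketch, not the actual LLaMA-Factory run: the README's LoRA
# hyperparameters expressed as a PEFT LoraConfig.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")

lora_config = LoraConfig(
    r=8,               # LoRA Rank (from the README)
    lora_alpha=16,     # LoRA Alpha (from the README)
    lora_dropout=0.0,  # LoRA Dropout (from the README)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed, not in the diff
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the low-rank adapters are trainable
```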
@@ -80,7 +80,8 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 
 ## Training Process
 
-The model was fine-tuned for 3.0 epochs over a total of
+The model was fine-tuned for **3.0 epochs** over a total of **18,750 steps** on a single **A6000 GPU**. Training used a **linear learning-rate schedule** that decayed from an initial **5e-5** toward zero, and combined **LoRA** with `bf16` precision and **FlashAttention-2** for efficient memory use and speed.
+
 
 Here's a glimpse into the training progression:
 
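As a rough illustration of the schedule described in the hunk above, the settings could be expressed with `transformers` `TrainingArguments` as sketched below. Only the epoch count, learning rate, linear schedule, `bf16` precision, and FlashAttention-2 choice come from the README; the batch size, gradient accumulation, and output path are placeholder assumptions.

```python
# Hedged sketch of a comparable setup, not the recorded LLaMA-Factory config.
import torch
from transformers import AutoModelForCausalLM, TrainingArguments

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B-Instruct",
    torch_dtype=torch.bfloat16,               # bf16 precision (from the README)
    attn_implementation="flash_attention_2",  # FlashAttention-2 (from the README)
)

training_args = TrainingArguments(
    output_dir="logicflow-llama-3b-lora",  # assumed path
    num_train_epochs=3.0,                  # from the README
    learning_rate=5e-5,                    # from the README
    lr_scheduler_type="linear",            # from the README
    bf16=True,                             # from the README
    per_device_train_batch_size=2,         # assumed
    gradient_accumulation_steps=8,         # assumed
)
```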
@@ -94,6 +95,18 @@ Below is a visualization of the training loss curve:
 
 
 
+### 📊 Final Training Metrics
+
+| Metric                 | Value                       |
+|------------------------|-----------------------------|
+| **Epochs**             | 3.0                         |
+| **Input Tokens Seen**  | 613,609,008                 |
+| **Total FLOPs**        | 9,706,625,883 GFLOPs        |
+| **Final Train Loss**   | 0.435                       |
+| **Total Runtime**      | 1 day, 22 hours, 12 minutes |
+| **Samples per Second** | 1.803                       |
+| **Steps per Second**   | 0.113                       |
+
 ### Training Configuration (from `llamaboard_config.yaml`):
 
 ```yaml
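The throughput figures added in this hunk are mutually consistent; here is a quick back-of-the-envelope check (values copied from the table above, derived rates approximate).

```python
# Sanity check of the reported training metrics.
runtime_s = (1 * 24 + 22) * 3600 + 12 * 60  # 1 day, 22 hours, 12 minutes -> 166,320 s
total_steps = 18_750
input_tokens_seen = 613_609_008

print(f"steps/sec  ~ {total_steps / runtime_s:.3f}")         # ~0.113, matches the table
print(f"tokens/sec ~ {input_tokens_seen / runtime_s:,.0f}")  # ~3,700 tokens per second
print(f"samples    ~ {1.803 * runtime_s:,.0f}")              # ~300k samples over 3.0 epochs
```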