Commit d8775c2 (verified) · committed by RekklesAI · 1 parent: 754afd2

Update README.md

Files changed (1):
  1. README.md +17 -4
README.md CHANGED
@@ -18,10 +18,10 @@ datasets:
 
 # LogicFlow-Llama-3B
 
- 🚀 **Introducing LogicFlow-Llama-3B: Exploring Open Access to Chain-of-Thought Reasoning**
-
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/664589a52d210101d1eac6ad/l_vPNI8K1AbiHHXUTo6aa.png)
 
+ 🚀 **Introducing LogicFlow-Llama-3B: Exploring Open Access to Chain-of-Thought Reasoning**
+
 Ever wished your AI could not just *tell* you the answer, but *show* you its thinking? **LogicFlow-Llama-3B** represents an exciting attempt to instill robust Chain-of-Thought (CoT) capabilities into models like `meta-llama/Llama-3.2-3B-Instruct`, which, in its base form, does not possess strong inherent CoT reasoning. This isn't just another fine-tune; it's a meticulously crafted model designed to explore the potential of CoT on accessible hardware.
 
 Leveraging the insightful `open-thoughts/OpenThoughts-114k` dataset and the versatile [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) training library, LogicFlow-Llama-3B has been trained to dissect intricate problems and articulate its reasoning process step-by-step. Remarkably, this entire fine-tuning process was accomplished **on a single GPU**, demonstrating a pathway to more accessible CoT model development. Get ready to explore the frontiers of logical AI and unlock a new era of AI-powered deep thinking, even with limited resources!
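The `open-thoughts/OpenThoughts-114k` corpus referenced above can be inspected directly from the Hub. Below is a minimal sketch, assuming the dataset's default configuration and a `train` split; it only loads and prints the data so the column names are discovered rather than assumed:

```python
# Minimal sketch: inspect the CoT training corpus referenced above.
# Assumes the default configuration of open-thoughts/OpenThoughts-114k
# and a "train" split; field names are printed rather than hard-coded.
from datasets import load_dataset

dataset = load_dataset("open-thoughts/OpenThoughts-114k", split="train")

print(dataset)           # row count and column names
print(list(dataset[0]))  # field names of the first example
```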
@@ -34,7 +34,7 @@ Leveraging the insightful `open-thoughts/OpenThoughts-114k` dataset and the vers
 - **Fine-tuning Method:** LoRA (Low-Rank Adaptation)
 - **Fine-tuning Library:** LLaMA-Factory
 - **Dataset:** `open-thoughts/OpenThoughts-114k` (for Chain-of-Thought enhancement)
- - **Training Hardware:** Single GPU
+ - **Training Hardware:** Single GPU (NVIDIA A6000)
 - **LoRA Rank:** 8
 - **LoRA Alpha:** 16
 - **LoRA Dropout:** 0
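The adapter settings listed above map onto a standard `peft` configuration roughly as follows. This is a sketch rather than the exact training setup: the `target_modules` choice and the way the adapter is attached are assumptions, and the authoritative values live in the `llamaboard_config.yaml` excerpted further below.

```python
# Sketch: the LoRA hyperparameters above expressed as a peft LoraConfig.
# target_modules is an assumption (a common choice for Llama attention
# projections); the modules actually used in training come from the
# llamaboard_config.yaml shown later in this README.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")

lora_config = LoraConfig(
    r=8,               # LoRA rank
    lora_alpha=16,     # LoRA alpha
    lora_dropout=0.0,  # LoRA dropout
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # LoRA trains only a small fraction of the weights
```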
@@ -80,7 +80,8 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 
 ## Training Process
 
- The model was fine-tuned for 3.0 epochs over a total of 18750 steps on a single GPU. The training was conducted using a linear learning rate scheduler, starting from an initial learning rate of 5e-5.
+ The model was fine-tuned for **3.0 epochs** over a total of **18,750 steps** on a single **A6000 GPU**. Training employed a **linear learning rate scheduler**, starting from an initial rate of **5e-5**, with gradual decay toward zero. The process leveraged **LoRA** with `bf16` precision and **FlashAttention2** for efficient memory use and speed.
+
 
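The schedule described above translates roughly into standard `transformers` training arguments. The following is a minimal sketch: only the epoch count, learning rate, scheduler type, and `bf16` flag come from the text, while the output path, batch size, and gradient accumulation are assumptions (2 × 8 gives an effective batch of 16, which matches the samples-per-second and steps-per-second figures reported in the metrics further down).

```python
# Sketch: the training schedule above as transformers TrainingArguments.
# num_train_epochs, learning_rate, lr_scheduler_type, and bf16 are taken
# from the README; output_dir, batch size, and accumulation are assumptions.
# (FlashAttention2 is enabled when the model is loaded, not here.)
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="logicflow-llama-3b-lora",  # hypothetical output path
    num_train_epochs=3.0,
    learning_rate=5e-5,
    lr_scheduler_type="linear",            # linear decay toward zero
    bf16=True,
    per_device_train_batch_size=2,         # assumption
    gradient_accumulation_steps=8,         # assumption (2 x 8 = 16 effective)
    logging_steps=10,
)
```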
 Here's a glimpse into the training progression:
 
@@ -94,6 +95,18 @@ Below is a visualization of the training loss curve:
 
 ![Training Loss](training_loss.png)
 
+ ### 📊 Final Training Metrics
+
+ | Metric                 | Value                       |
+ |------------------------|-----------------------------|
+ | **Epochs**             | 3.0                         |
+ | **Input Tokens Seen**  | 613,609,008                 |
+ | **Total FLOPs**        | 9,706,625,883 GFLOPs        |
+ | **Final Train Loss**   | 0.435                       |
+ | **Total Runtime**      | 1 day, 22 hours, 12 minutes |
+ | **Samples per Second** | 1.803                       |
+ | **Steps per Second**   | 0.113                       |
+
 ### Training Configuration (from `llamaboard_config.yaml`):
 
 ```yaml
 