RekklesAI committed on
Commit 47acf98 · verified · 1 Parent(s): 538ee31

Update README.md

  1. README.md +10 -10
README.md CHANGED
@@ -148,20 +148,20 @@ The loss curve demonstrates stable convergence with the final training loss reac
 
  | **Benchmark** | **Metric** | **Base Gemma-3-27B-IT** | **LogicFlow-Gemma-3-27b-thinking** | **Improvement** |
  |---------------|------------|--------------------------|-------------------------------------|-----------------|
- | ** Mathematical Reasoning** |
+ | **Mathematical Reasoning** |
  | GSM8K | Exact Match | 82.6% | **89.5%** | **+6.9%** |
  | MATH | Accuracy | 50.0% | **76.8%** | **+26.8%** |
- | ** Code Generation** |
+ | **Code Generation** |
  | MBPP | pass@1 | 65.6% | **69.0%** | **+3.4%** |
  | HumanEval | 0-shot | 48.8% | *Pending* | *TBD* |
- | ** Instruction Following** |
+ | **Instruction Following** |
  | IFEval | Prompt-level | *45.0%* | **40.0%** | **-5.0%** |
  | IFEval | Instruction-level | *58.0%* | **53.1%** | **-4.9%** |
- | ** Advanced Mathematics** |
+ | **Advanced Mathematics** |
  | AIME25 | Problem Solving | ~8-12% | **13.3%** | **+1-5%** |
- | ** Scientific Reasoning** |
+ | **Scientific Reasoning** |
  | GPQA Diamond | Science QA | ~30-35% | **45.96%** | **+11-16%** |
- | ** Knowledge & Understanding** |
+ | **Knowledge & Understanding** |
  | MMLU | Overall Accuracy | 78.6% | **75.3%** | **-3.3%** |
  | MMLU STEM | Sciences & Math | ~70.0% | **71.6%** | **+1.6%** |
  | MMLU Humanities | Arts & Literature | ~67.0% | **69.2%** | **+2.2%** |
@@ -422,10 +422,10 @@ The model showcases systematic thinking through:
  - Clear documentation of the reasoning process
 
  These examples demonstrate the model's ability to:
- - ** Break down complex problems** into manageable steps
- - ** Self-verify results** using multiple approaches
- - ** Document reasoning chains** for transparency
- - ** Maintain accuracy** while showing work
+ - **Break down complex problems** into manageable steps
+ - **Self-verify results** using multiple approaches
+ - **Document reasoning chains** for transparency
+ - **Maintain accuracy** while showing work
 
  ### Activating Chain-of-Thought Reasoning
 