RekklesAI committed on
Commit a763a3d · verified · 1 Parent(s): 47acf98

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -149,8 +149,8 @@ The loss curve demonstrates stable convergence with the final training loss reac
  | **Benchmark** | **Metric** | **Base Gemma-3-27B-IT** | **LogicFlow-Gemma-3-27b-thinking** | **Improvement** |
  |---------------|------------|--------------------------|-------------------------------------|-----------------|
  | **Mathematical Reasoning** |
- | GSM8K | Exact Match | 82.6% | **89.5%** | **+6.9%** |
- | MATH | Accuracy | 50.0% | **76.8%** | **+26.8%** |
+ | GSM8K | 5-shot | 82.6% | **89.5%** | **+6.9%** |
+ | MATH | 5-shot | 50.0% | **76.8%** | **+26.8%** |
  | **Code Generation** |
  | MBPP | pass@1 | 65.6% | **69.0%** | **+3.4%** |
  | HumanEval | 0-shot | 48.8% | *Pending* | *TBD* |
@@ -158,9 +158,9 @@ The loss curve demonstrates stable convergence with the final training loss reac
  | IFEval | Prompt-level | *45.0%* | **40.0%** | **-5.0%** |
  | IFEval | Instruction-level | *58.0%* | **53.1%** | **-4.9%** |
  | **Advanced Mathematics** |
- | AIME25 | Problem Solving | ~8-12% | **13.3%** | **+1-5%** |
+ | AIME25 | 5-shot | ~8-12% | **13.3%** | **+1-5%** |
  | **Scientific Reasoning** |
- | GPQA Diamond | Science QA | ~30-35% | **45.96%** | **+11-16%** |
+ | GPQA Diamond | 5-shot | ~30-35% | **45.96%** | **+11-16%** |
  | **Knowledge & Understanding** |
  | MMLU | Overall Accuracy | 78.6% | **75.3%** | **-3.3%** |
  | MMLU STEM | Sciences & Math | ~70.0% | **71.6%** | **+1.6%** |
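The **Improvement** column appears to be the fine-tuned score minus the base score, that is, a difference in percentage points rather than a relative change. The sketch below rechecks that arithmetic for the rows with exact baselines and for the two rows with approximate (`~`) baseline ranges. The helper names `improvement_points` and `improvement_range` are hypothetical and are not part of the commit or the README; the scores are copied from the table.

```python
def improvement_points(base: float, tuned: float) -> float:
    """Difference between two percentage scores, in percentage points."""
    return round(tuned - base, 2)


def improvement_range(base_low: float, base_high: float, tuned: float) -> tuple[float, float]:
    """Smallest and largest improvement when the baseline is only known as a range."""
    return (improvement_points(base_high, tuned), improvement_points(base_low, tuned))


# (benchmark, base score %, fine-tuned score %) copied from the table
exact_rows = [
    ("GSM8K", 82.6, 89.5),
    ("MATH", 50.0, 76.8),
    ("MBPP", 65.6, 69.0),
    ("IFEval (prompt-level)", 45.0, 40.0),
    ("IFEval (instruction-level)", 58.0, 53.1),
    ("MMLU", 78.6, 75.3),
]

# (benchmark, base low %, base high %, fine-tuned score %) for approximate baselines
range_rows = [
    ("AIME25", 8.0, 12.0, 13.3),
    ("GPQA Diamond", 30.0, 35.0, 45.96),
]

deltas = {name: improvement_points(base, tuned) for name, base, tuned in exact_rows}
ranges = {name: improvement_range(low, high, tuned) for name, low, high, tuned in range_rows}

for name, delta in deltas.items():
    print(f"{name}: {delta:+.1f} points")
for name, (low, high) in ranges.items():
    print(f"{name}: {low:+.2f} to {high:+.2f} points")
```

Under this reading, the exact rows reproduce the table's values, and the approximate rows give ranges that round to the table's `+1-5%` and `+11-16%` entries.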