RekklesAI committed
Commit 403025b · verified · 1 Parent(s): 2b4965f

Upload README.md with huggingface_hub

  1. README.md +37 -46
README.md CHANGED
@@ -27,38 +27,20 @@ pipeline_tag: image-text-to-text
27
 
28
  LogicFlow-Gemma-3-27b-thinking is an advanced **multimodal reasoning model** built upon [google/gemma-3-27b-it](https://huggingface.co/google/gemma-3-27b-it), specifically designed to excel at complex logical reasoning, mathematical problem-solving, and step-by-step analytical thinking. This model represents a significant advancement in AI reasoning capabilities, achieved through careful fine-tuning on three specialized, high-quality datasets using the LoRA (Low-Rank Adaptation) technique.
29
 
30
- ### Training Dataset Foundation
31
-
32
- Our model has been meticulously trained on three cutting-edge datasets, each contributing unique reasoning capabilities:
33
-
34
- #### 🧠 **OpenO1-SFT Dataset**
35
- - **Purpose**: Supervised fine-tuning for advanced reasoning patterns
36
- - **Content**: High-quality reasoning demonstrations with explicit thought processes
37
- - **Impact**: Enables the model to break down complex problems systematically and show transparent reasoning chains
38
-
39
- #### 💭 **Open-Thoughts Dataset**
40
- - **Purpose**: Step-by-step thinking process modeling
41
- - **Content**: Detailed internal monologues and reasoning progressions for various problem types
42
- - **Impact**: Teaches the model to externalize its thinking process, making reasoning transparent and verifiable
43
-
44
- #### 🔢 **OpenR1-Math Dataset**
45
- - **Purpose**: Mathematical reasoning and problem-solving specialization
46
- - **Content**: Comprehensive mathematical problems with detailed solution methodologies
47
- - **Impact**: Significantly enhances performance on mathematical reasoning tasks, from basic arithmetic to advanced competition-level problems
48
 
49
  ### Key Innovations
50
 
51
  This unique combination of datasets creates a model that not only provides correct answers but also demonstrates **how** it arrives at those answers, making it particularly valuable for educational applications, research, and any scenario requiring explainable AI reasoning.
52
 
53
  The model demonstrates enhanced capabilities in:
54
- - **🧠 Logical Reasoning**: Improved ability to work through complex logical problems step by step
55
- - **🔢 Mathematical Problem Solving**: Enhanced performance on mathematical reasoning tasks (76.8% MATH, 13.3% AIME25)
56
- - **🔬 Scientific Analysis**: Exceptional scientific reasoning capabilities (45.96% GPQA Diamond)
57
- - **💭 Chain-of-Thought Reasoning**: Superior step-by-step thinking with detailed reasoning chains and self-verification
58
- - **📊 Structured Analysis**: Improved at breaking down complex problems into manageable components
59
- - **✅ Multi-Method Verification**: Uses multiple approaches to validate results and ensure accuracy
60
- - **👁️ Vision Understanding**: Ability to analyze and reason about images, charts, diagrams, and visual data
61
- - **🔄 Multimodal Reasoning**: Combining visual and textual information for comprehensive analysis
62
 
63
  ## Model Details
64
 
@@ -77,11 +59,20 @@ The model demonstrates enhanced capabilities in:
77
  ### Training Data
78
  The model was fine-tuned on three carefully selected, high-quality datasets that form the foundation of its exceptional reasoning capabilities:
79
 
80
- - **🧠 OpenO1-SFT**: Advanced supervised fine-tuning dataset containing high-quality reasoning demonstrations with explicit thought processes, enabling systematic problem breakdown and transparent reasoning chains
 
 
 
81
 
82
- - **💭 Open-Thoughts**: Specialized dataset focused on step-by-step thinking processes, featuring detailed internal monologues and reasoning progressions that teach the model to externalize and structure its thinking
 
 
 
83
 
84
- - **🔢 OpenR1-Math**: Comprehensive mathematical reasoning dataset with detailed solution methodologies, significantly enhancing performance from basic arithmetic to advanced competition-level mathematical problems
 
 
 
85
 
86
  This synergistic combination creates a model that excels not only at providing accurate answers but also at demonstrating clear, verifiable reasoning processes.
87
 
@@ -156,20 +147,20 @@ The loss curve demonstrates stable convergence with the final training loss reac
156
 
157
  | **Benchmark** | **Metric** | **Base Gemma-3-27B-IT** | **LogicFlow-Gemma-3-27b-thinking** | **Improvement** |
158
  |---------------|------------|--------------------------|-------------------------------------|-----------------|
159
- | **📊 Mathematical Reasoning** |
160
  | GSM8K | Exact Match | 82.6% | **89.5%** | **+6.9%** |
161
  | MATH | Accuracy | 50.0% | **76.8%** | **+26.8%** |
162
- | **💻 Code Generation** |
163
  | MBPP | pass@1 | 65.6% | **69.0%** | **+3.4%** |
164
  | HumanEval | 0-shot | 48.8% | *Pending* | *TBD* |
165
- | **🎯 Instruction Following** |
166
  | IFEval | Prompt-level | *45.0%* | **40.0%** | **-5.0%** |
167
  | IFEval | Instruction-level | *58.0%* | **53.1%** | **-4.9%** |
168
- | **🏆 Advanced Mathematics** |
169
  | AIME25 | Problem Solving | ~8-12% | **13.3%** | **+1-5%** |
170
- | **🔬 Scientific Reasoning** |
171
  | GPQA Diamond | Science QA | ~30-35% | **45.96%** | **+11-16%** |
172
- | **🧠 Knowledge & Understanding** |
173
  | MMLU | Overall Accuracy | 78.6% | **75.3%** | **-3.3%** |
174
  | MMLU STEM | Sciences & Math | ~70.0% | **71.6%** | **+1.6%** |
175
  | MMLU Humanities | Arts & Literature | ~67.0% | **69.2%** | **+2.2%** |
@@ -178,7 +169,7 @@ The loss curve demonstrates stable convergence with the final training loss reac
178
 
179
  ### Key Performance Insights
180
 
181
- #### ✅ **Significant Improvements**
182
  - **Mathematical Reasoning**: Exceptional improvements - GSM8K (+6.9%) and MATH (+26.8%) demonstrate enhanced step-by-step problem solving
183
  - **Advanced Mathematics**: A 26.8 percentage-point gain on the MATH benchmark (50.0% to 76.8%) reflects stronger mathematical reasoning
184
  - **Scientific Reasoning**: Outstanding 45.96% accuracy on GPQA Diamond - significantly above typical model performance (30-35%)
@@ -186,12 +177,12 @@ The loss curve demonstrates stable convergence with the final training loss reac
186
  - **Code Generation**: 3.4% improvement on MBPP shows better programming logic understanding
187
  - **Domain-Specific Knowledge**: Improvements in STEM (+1.6%), Humanities (+2.2%), and Social Sciences (+2.3%)
188
 
189
- #### ⚠️ **Trade-offs Observed**
190
  - **Instruction Following**: Slight decrease in IFEval scores (-5% prompt-level, -4.9% instruction-level)
191
  - **General Knowledge**: Overall MMLU score decreased by 3.3% due to reasoning specialization
192
  - **Reasoning Focus**: Model optimized for deep analytical thinking over rapid instruction compliance
193
 
194
- #### 🎯 **Specialized Capabilities**
195
  - **Mathematical Excellence**: Outstanding 76.8% accuracy on MATH benchmark - among the top performances for 27B models
196
  - **Scientific Reasoning**: Exceptional 45.96% on GPQA Diamond - handling graduate-level physics, chemistry, and biology problems
197
  - **Elite Competition Performance**: Competitive 13.3% on AIME25 - tackling American Invitational Mathematics Exam challenges
@@ -430,10 +421,10 @@ The model showcases systematic thinking through:
430
  - Clear documentation of the reasoning process
431
 
432
  These examples demonstrate the model's ability to:
433
- - **🔍 Break down complex problems** into manageable steps
434
- - **✅ Self-verify results** using multiple approaches
435
- - **📝 Document reasoning chains** for transparency
436
- - **🎯 Maintain accuracy** while showing work
437
 
438
  ### Activating Chain-of-Thought Reasoning
439
 
@@ -472,26 +463,26 @@ Show your reasoning process before giving the final answer."""
472
 
473
  This multimodal model is particularly well-suited for:
474
 
475
- ### 📚 Educational Applications
476
  - **Chain-of-Thought Tutoring**: Demonstrates complete problem-solving processes with transparent reasoning steps
477
  - **Mathematical Education**: Shows multiple verification methods for mathematical concepts (as seen in 9.11 vs 9.9 example)
478
  - **Critical Thinking Development**: Models systematic analysis and self-verification techniques
479
  - **Visual Learning**: Analyzing educational diagrams, charts, and mathematical illustrations
480
  - **Interactive Learning**: Combining text and visual elements for comprehensive understanding
481
 
482
- ### 🔢 Mathematical & Scientific Analysis
483
  - **Chart Analysis**: Interpreting graphs, statistical charts, and data visualizations
484
  - **Geometric Problem Solving**: Analyzing geometric figures and spatial relationships
485
  - **Scientific Diagram Understanding**: Processing scientific illustrations and technical drawings
486
  - **Formula Recognition**: Understanding mathematical formulas in images
487
 
488
- ### 💼 Professional Applications
489
  - **Document Analysis**: Processing documents containing both text and visual elements
490
  - **Technical Documentation**: Understanding technical manuals with diagrams
491
  - **Data Visualization**: Analyzing and explaining complex charts and infographics
492
  - **Research Assistance**: Combining textual research with visual data analysis
493
 
494
- ### 🧠 Advanced Reasoning Tasks
495
  - **Chain-of-Thought Problem Solving**: Complex reasoning with detailed step-by-step analysis and self-verification
496
  - **Multi-Method Validation**: Using multiple approaches to verify answers (numerical comparison, pattern analysis, etc.)
497
  - **Transparent Decision Making**: Showing complete reasoning chains for critical analysis tasks
 
27
 
28
  LogicFlow-Gemma-3-27b-thinking is an advanced **multimodal reasoning model** built upon [google/gemma-3-27b-it](https://huggingface.co/google/gemma-3-27b-it), specifically designed to excel at complex logical reasoning, mathematical problem-solving, and step-by-step analytical thinking. This model represents a significant advancement in AI reasoning capabilities, achieved through careful fine-tuning on three specialized, high-quality datasets using the LoRA (Low-Rank Adaptation) technique.
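
  As a rough illustration of what a LoRA (Low-Rank Adaptation) update does, the sketch below adds a low-rank term `(alpha / rank) * B @ A` to a tiny frozen weight matrix in plain Python. The matrix shapes, the rank, the `alpha` scale, and the helper names (`matmul`, `lora_update`, `trainable_params`) are assumptions chosen for the example; the README does not state the adapter configuration used for this model.

```python
# Illustrative LoRA-style update on a tiny weight matrix (pure Python).
# All shapes and values are made up for this example; they are NOT the
# model's actual weights or adapter settings.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def lora_update(w, b, a, alpha, rank):
    """Return W + (alpha / rank) * (B @ A) without modifying W."""
    scale = alpha / rank
    delta = matmul(b, a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

def trainable_params(d_out, d_in, rank):
    """Adapter parameter count: B is d_out x rank, A is rank x d_in."""
    return rank * (d_out + d_in)

W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]          # frozen 2x3 base weight (hypothetical)
B = [[1.0], [2.0]]             # 2x1 adapter factor (hypothetical)
A = [[0.5, 0.0, -0.5]]         # 1x3 adapter factor (hypothetical)

adapted = lora_update(W, B, A, alpha=2, rank=1)
print(adapted)
# Adapter size versus a full 4096x4096 matrix, for a hypothetical rank of 16.
print(trainable_params(4096, 4096, 16), 4096 * 4096)
```

  Only `A` and `B` would be trained in such a setup, which is why the adapter parameter count grows with the rank rather than with the full size of the weight matrix.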
29
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
30
 
31
  ### Key Innovations
32
 
33
  This unique combination of datasets creates a model that not only provides correct answers but also demonstrates **how** it arrives at those answers, making it particularly valuable for educational applications, research, and any scenario requiring explainable AI reasoning.
34
 
35
  The model demonstrates enhanced capabilities in:
36
+ - **Logical Reasoning**: Improved ability to work through complex logical problems step by step
37
+ - **Mathematical Problem Solving**: Enhanced performance on mathematical reasoning tasks (76.8% MATH, 13.3% AIME25)
38
+ - **Scientific Analysis**: Exceptional scientific reasoning capabilities (45.96% GPQA Diamond)
39
+ - **Chain-of-Thought Reasoning**: Superior step-by-step thinking with detailed reasoning chains and self-verification
40
+ - **Structured Analysis**: Improved at breaking down complex problems into manageable components
41
+ - **Multi-Method Verification**: Uses multiple approaches to validate results and ensure accuracy
42
+ - **Vision Understanding**: Ability to analyze and reason about images, charts, diagrams, and visual data
43
+ - **Multimodal Reasoning**: Combining visual and textual information for comprehensive analysis
44
 
45
  ## Model Details
46
 
 
59
  ### Training Data
60
  The model was fine-tuned on three carefully selected, high-quality datasets that form the foundation of its exceptional reasoning capabilities:
61
 
62
+ #### **OpenO1-SFT Dataset**
63
+ - **Purpose**: Supervised fine-tuning for advanced reasoning patterns
64
+ - **Content**: High-quality reasoning demonstrations with explicit thought processes
65
+ - **Impact**: Enables the model to break down complex problems systematically and show transparent reasoning chains
66
 
67
+ #### **Open-Thoughts Dataset**
68
+ - **Purpose**: Step-by-step thinking process modeling
69
+ - **Content**: Detailed internal monologues and reasoning progressions for various problem types
70
+ - **Impact**: Teaches the model to externalize its thinking process, making reasoning transparent and verifiable
71
 
72
+ #### **OpenR1-Math Dataset**
73
+ - **Purpose**: Mathematical reasoning and problem-solving specialization
74
+ - **Content**: Comprehensive mathematical problems with detailed solution methodologies
75
+ - **Impact**: Significantly enhances performance on mathematical reasoning tasks, from basic arithmetic to advanced competition-level problems
76
 
77
  This synergistic combination creates a model that excels not only at providing accurate answers but also at demonstrating clear, verifiable reasoning processes.
78
 
 
147
 
148
  | **Benchmark** | **Metric** | **Base Gemma-3-27B-IT** | **LogicFlow-Gemma-3-27b-thinking** | **Improvement** |
149
  |---------------|------------|--------------------------|-------------------------------------|-----------------|
150
+ | **Mathematical Reasoning** |
151
  | GSM8K | Exact Match | 82.6% | **89.5%** | **+6.9%** |
152
  | MATH | Accuracy | 50.0% | **76.8%** | **+26.8%** |
153
+ | **Code Generation** |
154
  | MBPP | pass@1 | 65.6% | **69.0%** | **+3.4%** |
155
  | HumanEval | 0-shot | 48.8% | *Pending* | *TBD* |
156
+ | **Instruction Following** |
157
  | IFEval | Prompt-level | *45.0%* | **40.0%** | **-5.0%** |
158
  | IFEval | Instruction-level | *58.0%* | **53.1%** | **-4.9%** |
159
+ | **Advanced Mathematics** |
160
  | AIME25 | Problem Solving | ~8-12% | **13.3%** | **+1-5%** |
161
+ | **Scientific Reasoning** |
162
  | GPQA Diamond | Science QA | ~30-35% | **45.96%** | **+11-16%** |
163
+ | **Knowledge & Understanding** |
164
  | MMLU | Overall Accuracy | 78.6% | **75.3%** | **-3.3%** |
165
  | MMLU STEM | Sciences & Math | ~70.0% | **71.6%** | **+1.6%** |
166
  | MMLU Humanities | Arts & Literature | ~67.0% | **69.2%** | **+2.2%** |
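
  The Improvement column in the table above reports differences in percentage points, not relative changes. A minimal sketch for recomputing both from the table rows that give exact scores on each side; the helper names `point_delta` and `relative_change` are introduced here for illustration, and rows with approximate (`~`) or pending values are left out.

```python
# Scores copied from the benchmark table as (base, fine-tuned) pairs.
# The IFEval base values appear in italics in the table; they are used as given.
SCORES = {
    "GSM8K": (82.6, 89.5),
    "MATH": (50.0, 76.8),
    "MBPP": (65.6, 69.0),
    "IFEval (prompt-level)": (45.0, 40.0),
    "IFEval (instruction-level)": (58.0, 53.1),
    "MMLU": (78.6, 75.3),
}

def point_delta(base, tuned):
    """Absolute difference in percentage points, rounded to one decimal."""
    return round(tuned - base, 1)

def relative_change(base, tuned):
    """Change relative to the base score, in percent, rounded to one decimal."""
    return round(100.0 * (tuned - base) / base, 1)

for name, (base, tuned) in SCORES.items():
    print(f"{name}: {point_delta(base, tuned):+.1f} points, "
          f"{relative_change(base, tuned):+.1f}% relative")
```

  Read this way, the MATH row is a 26.8-point gain, which corresponds to a much larger relative change because the base score is 50.0%.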
 
169
 
170
  ### Key Performance Insights
171
 
172
+ #### **Significant Improvements**
173
  - **Mathematical Reasoning**: Exceptional improvements - GSM8K (+6.9%) and MATH (+26.8%) demonstrate enhanced step-by-step problem solving
174
  - **Advanced Mathematics**: A 26.8 percentage-point gain on the MATH benchmark (50.0% to 76.8%) reflects stronger mathematical reasoning
175
  - **Scientific Reasoning**: Outstanding 45.96% accuracy on GPQA Diamond - significantly above typical model performance (30-35%)
 
177
  - **Code Generation**: 3.4% improvement on MBPP shows better programming logic understanding
178
  - **Domain-Specific Knowledge**: Improvements in STEM (+1.6%), Humanities (+2.2%), and Social Sciences (+2.3%)
179
 
180
+ #### **Trade-offs Observed**
181
  - **Instruction Following**: Slight decrease in IFEval scores (-5% prompt-level, -4.9% instruction-level)
182
  - **General Knowledge**: Overall MMLU score decreased by 3.3% due to reasoning specialization
183
  - **Reasoning Focus**: Model optimized for deep analytical thinking over rapid instruction compliance
184
 
185
+ #### **Specialized Capabilities**
186
  - **Mathematical Excellence**: Outstanding 76.8% accuracy on MATH benchmark - among the top performances for 27B models
187
  - **Scientific Reasoning**: Exceptional 45.96% on GPQA Diamond - handling graduate-level physics, chemistry, and biology problems
188
  - **Elite Competition Performance**: Competitive 13.3% on AIME25 - tackling American Invitational Mathematics Exam challenges
 
421
  - Clear documentation of the reasoning process
422
 
423
  These examples demonstrate the model's ability to:
424
+ - **Break down complex problems** into manageable steps
425
+ - **Self-verify results** using multiple approaches
426
+ - **Document reasoning chains** for transparency
427
+ - **Maintain accuracy** while showing work
428
 
429
  ### Activating Chain-of-Thought Reasoning
430
 
 
463
 
464
  This multimodal model is particularly well-suited for:
465
 
466
+ ### Educational Applications
467
  - **Chain-of-Thought Tutoring**: Demonstrates complete problem-solving processes with transparent reasoning steps
468
  - **Mathematical Education**: Shows multiple verification methods for mathematical concepts (as seen in 9.11 vs 9.9 example)
469
  - **Critical Thinking Development**: Models systematic analysis and self-verification techniques
470
  - **Visual Learning**: Analyzing educational diagrams, charts, and mathematical illustrations
471
  - **Interactive Learning**: Combining text and visual elements for comprehensive understanding
472
 
473
+ ### Mathematical & Scientific Analysis
474
  - **Chart Analysis**: Interpreting graphs, statistical charts, and data visualizations
475
  - **Geometric Problem Solving**: Analyzing geometric figures and spatial relationships
476
  - **Scientific Diagram Understanding**: Processing scientific illustrations and technical drawings
477
  - **Formula Recognition**: Understanding mathematical formulas in images
478
 
479
+ ### Professional Applications
480
  - **Document Analysis**: Processing documents containing both text and visual elements
481
  - **Technical Documentation**: Understanding technical manuals with diagrams
482
  - **Data Visualization**: Analyzing and explaining complex charts and infographics
483
  - **Research Assistance**: Combining textual research with visual data analysis
484
 
485
+ ### Advanced Reasoning Tasks
486
  - **Chain-of-Thought Problem Solving**: Complex reasoning with detailed step-by-step analysis and self-verification
487
  - **Multi-Method Validation**: Using multiple approaches to verify answers (numerical comparison, pattern analysis, etc.)
488
  - **Transparent Decision Making**: Showing complete reasoning chains for critical analysis tasks