RekklesAI commited on Jul 23

Commit

7c99a96

verified ·

1 Parent(s): 5086c6b

Upload folder using huggingface_hub

Browse files

Files changed (26) hide show

.gitattributes +3 -34
CoT_example_1.png +3 -0
CoT_example_2.png +0 -0
README.md +598 -0
added_tokens.json +3 -0
chat_template.jinja +19 -0
config.json +64 -0
generation_config.json +13 -0
model-00001-of-00012.safetensors +3 -0
model-00002-of-00012.safetensors +3 -0
model-00003-of-00012.safetensors +3 -0
model-00004-of-00012.safetensors +3 -0
model-00005-of-00012.safetensors +3 -0
model-00006-of-00012.safetensors +3 -0
model-00007-of-00012.safetensors +3 -0
model-00008-of-00012.safetensors +3 -0
model-00009-of-00012.safetensors +3 -0
model-00010-of-00012.safetensors +3 -0
model-00011-of-00012.safetensors +3 -0
model-00012-of-00012.safetensors +3 -0
model.safetensors.index.json +0 -0
preprocessor_config.json +29 -0
special_tokens_map.json +33 -0
tokenizer.json +3 -0
tokenizer_config.json +0 -0
training_loss.png +0 -0

.gitattributes CHANGED Viewed

@@ -1,35 +1,4 @@
-*.7z filter=lfs diff=lfs merge=lfs -text
-*.arrow filter=lfs diff=lfs merge=lfs -text
-*.bin filter=lfs diff=lfs merge=lfs -text
-*.bz2 filter=lfs diff=lfs merge=lfs -text
-*.ckpt filter=lfs diff=lfs merge=lfs -text
-*.ftz filter=lfs diff=lfs merge=lfs -text
-*.gz filter=lfs diff=lfs merge=lfs -text
-*.h5 filter=lfs diff=lfs merge=lfs -text
-*.joblib filter=lfs diff=lfs merge=lfs -text
-*.lfs.* filter=lfs diff=lfs merge=lfs -text
-*.mlmodel filter=lfs diff=lfs merge=lfs -text
-*.model filter=lfs diff=lfs merge=lfs -text
-*.msgpack filter=lfs diff=lfs merge=lfs -text
-*.npy filter=lfs diff=lfs merge=lfs -text
-*.npz filter=lfs diff=lfs merge=lfs -text
-*.onnx filter=lfs diff=lfs merge=lfs -text
-*.ot filter=lfs diff=lfs merge=lfs -text
-*.parquet filter=lfs diff=lfs merge=lfs -text
-*.pb filter=lfs diff=lfs merge=lfs -text
-*.pickle filter=lfs diff=lfs merge=lfs -text
-*.pkl filter=lfs diff=lfs merge=lfs -text
-*.pt filter=lfs diff=lfs merge=lfs -text
-*.pth filter=lfs diff=lfs merge=lfs -text
-*.rar filter=lfs diff=lfs merge=lfs -text
 *.safetensors filter=lfs diff=lfs merge=lfs -text
-saved_model/**/* filter=lfs diff=lfs merge=lfs -text
-*.tar.* filter=lfs diff=lfs merge=lfs -text
-*.tar filter=lfs diff=lfs merge=lfs -text
-*.tflite filter=lfs diff=lfs merge=lfs -text
-*.tgz filter=lfs diff=lfs merge=lfs -text
-*.wasm filter=lfs diff=lfs merge=lfs -text
-*.xz filter=lfs diff=lfs merge=lfs -text
-*.zip filter=lfs diff=lfs merge=lfs -text
-*.zst filter=lfs diff=lfs merge=lfs -text
-*tfevents* filter=lfs diff=lfs merge=lfs -text

 *.safetensors filter=lfs diff=lfs merge=lfs -text
+*.png filter=lfs diff=lfs merge=lfs -text
+*.json filter=lfs diff=lfs merge=lfs -text
+*.jinja filter=lfs diff=lfs merge=lfs -text

CoT_example_1.png ADDED Viewed

Git LFS Details

SHA256: 06aa395dc7584974814c89322ab95eb21528a2d4cab018df85c3b18a09121b6b
Pointer size: 131 Bytes
Size of remote file: 113 kB

CoT_example_2.png ADDED Viewed

README.md ADDED Viewed

	@@ -0,0 +1,598 @@

+---
+license: other
+base_model: google/gemma-3-27b-it
+tags:
+- llama-factory
+- lora
+- reasoning
+- thinking
+- mathematics
+- merged
+- multimodal
+- vision
+- image-text-to-text
+- visual-reasoning
+language:
+- en
+pipeline_tag: image-text-to-text
+---
+# LogicFlow-gemma-3-27b-thinking
+## Model Description
+LogicFlow-gemma-3-27b-thinking is a fine-tuned **multimodal** version of [google/gemma-3-27b-it](https://huggingface.co/google/gemma-3-27b-it) that has been specifically optimized for logical reasoning, step-by-step thinking, and mathematical problem-solving with both text and image inputs. This model has been trained using LoRA (Low-Rank Adaptation) technique and then merged with the base model for optimal performance.
+The model demonstrates enhanced capabilities in:
+- **🧠 Logical Reasoning**: Improved ability to work through complex logical problems step by step
+- **🔢 Mathematical Problem Solving**: Enhanced performance on mathematical reasoning tasks (76.8% MATH, 13.3% AIME25)
+- **🔬 Scientific Analysis**: Exceptional scientific reasoning capabilities (45.96% GPQA Diamond)
+- **💭 Chain-of-Thought Reasoning**: Superior step-by-step thinking with detailed reasoning chains and self-verification
+- **📊 Structured Analysis**: Improved at breaking down complex problems into manageable components
+- **✅ Multi-Method Verification**: Uses multiple approaches to validate results and ensure accuracy
+- **👁️ Vision Understanding**: Ability to analyze and reason about images, charts, diagrams, and visual data
+- **🔄 Multimodal Reasoning**: Combining visual and textual information for comprehensive analysis
+## Model Details
+- **Model Type**: Multimodal Language Model (Gemma-3 Architecture)
+- **Base Model**: google/gemma-3-27b-it
+- **Parameters**: 27 billion parameters
+- **Fine-tuning Method**: LoRA (Low-Rank Adaptation) with merge
+- **Context Length**: 131,072 tokens
+- **Architecture**: Gemma-3 with vision capabilities
+- **Precision**: bfloat16
+- **Image Resolution**: 896x896 pixels, encoded to 256 tokens per image
+- **Supported Formats**: Text + Images (JPEG, PNG, WebP)
+## Training Details
+### Training Data
+The model was fine-tuned on a combination of high-quality datasets:
+- **openo1_sft**: Supervised fine-tuning data for reasoning
+- **open_thoughts**: Dataset focused on step-by-step thinking processes
+- **open_r1_math**: Mathematical reasoning and problem-solving dataset
+### Training Configuration
+#### Core Training Parameters
+- **Learning Rate**: 5e-05
+- **Epochs**: 5.0
+- **Optimizer**: AdamW (adamw_torch)
+- **LR Scheduler**: Cosine with 100 warmup steps
+- **Max Gradient Norm**: 1.0
+- **Max Samples**: 100,000
+- **Precision**: bfloat16 (bf16: true)
+#### Batch Configuration
+- **Per Device Train Batch Size**: 2
+- **Gradient Accumulation Steps**: 8
+- **Total Effective Batch Size**: 32
+- **Packing**: Disabled (false)
+#### LoRA Configuration
+- **Fine-tuning Type**: LoRA
+- **LoRA Rank (r)**: 8
+- **LoRA Alpha**: 16
+- **LoRA Dropout**: 0.0
+- **LoRA Target**: all (comprehensive layer targeting)
+#### Sequence and Vision Parameters
+- **Cutoff Length**: 2,048 tokens
+- **Image Max Pixels**: 589,824
+- **Image Min Pixels**: 1,024
+- **Video Max Pixels**: 65,536
+- **Video Min Pixels**: 256
+- **Flash Attention**: auto
+- **Freeze Vision Tower**: true
+- **Freeze Multi-modal Projector**: true
+#### Special Features
+- **Enable Thinking**: true (enhanced reasoning capability)
+- **Template**: gemma
+- **Trust Remote Code**: true
+- **Preprocessing Workers**: 16
+- **Save Steps**: 100
+- **Logging Steps**: 5
+### Training Results
+#### Performance Metrics
+- **Final Training Loss**: 0.003759
+- **Training Runtime**: 8,446.67 seconds (~2.35 hours)
+- **Training Samples per Second**: 156.929
+- **Training Steps per Second**: 4.904
+- **Total Training Steps**: 41,400
+- **Completed Epochs**: 4.999924559047633
+#### Resource Utilization
+- **Total Input Tokens Seen**: 2,531,530,240 tokens
+- **Total FLOPs**: 3.96 × 10²⁰
+- **DDP Timeout**: 180,000,000 seconds
+- **Plot Loss**: Enabled (training loss visualization available)
+### Training Loss Curve
+The model training included comprehensive loss tracking and visualization. The training loss curve below shows the convergence pattern over the 41,400 training steps across 5 epochs:
+![Training Loss](training_loss.png)
+The loss curve demonstrates stable convergence with the final training loss reaching 0.003759, indicating effective learning without overfitting.
+## Benchmark Performance
+### Comprehensive Evaluation Results
+Following established AI benchmarking best practices [(Domino AI, 2020)](https://domino.ai/blog/benchmarking-predictive-models), we conducted systematic evaluations across multiple domains to assess both predictive performance and operational characteristics. As emphasized by [(Cohere, 2025)](https://cohere.com/blog/ai-benchmarks-for-business), effective AI evaluation requires testing beyond simple accuracy metrics to capture real-world complexity and business needs.
+| **Benchmark** | **Metric** | **Base Gemma-3-27B-IT** | **LogicFlow-gemma-3-27b-thinking** | **Improvement** |
+|---------------|------------|--------------------------|-------------------------------------|-----------------|
+| **📊 Mathematical Reasoning** |
+| GSM8K | Exact Match | 82.6% | **89.5%** | **+6.9%** |
+| MATH | Accuracy | 50.0% | **76.8%** | **+26.8%** |
+| **💻 Code Generation** |
+| MBPP | pass@1 | 65.6% | **69.0%** | **+3.4%** |
+| HumanEval | 0-shot | 48.8% | *Pending* | *TBD* |
+| **🎯 Instruction Following** |
+| IFEval | Prompt-level | *45.0%* | **40.0%** | **-5.0%** |
+| IFEval | Instruction-level | *58.0%* | **53.1%** | **-4.9%** |
+| **🏆 Advanced Mathematics** |
+| AIME25 | Problem Solving | ~8-12% | **13.3%** | **+1-5%** |
+| **🔬 Scientific Reasoning** |
+| GPQA Diamond | Science QA | ~30-35% | **45.96%** | **+11-16%** |
+| **🧠 Knowledge & Understanding** |
+| MMLU | Overall Accuracy | 78.6% | **75.3%** | **-3.3%** |
+| MMLU STEM | Sciences & Math | ~70.0% | **71.6%** | **+1.6%** |
+| MMLU Humanities | Arts & Literature | ~67.0% | **69.2%** | **+2.2%** |
+| MMLU Social Sciences | Psychology & Economics | ~82.0% | **84.3%** | **+2.3%** |
+| MMLU Other | Professional & Medical | ~77.0% | **79.2%** | **+2.2%** |
+### Key Performance Insights
+#### ✅ **Significant Improvements**
+- **Mathematical Reasoning**: Exceptional improvements - GSM8K (+6.9%) and MATH (+26.8%) demonstrate enhanced step-by-step problem solving
+- **Advanced Mathematics**: Massive 26.8% improvement on MATH benchmark showcases superior mathematical reasoning capabilities
+- **Scientific Reasoning**: Outstanding 45.96% accuracy on GPQA Diamond - significantly above typical model performance (30-35%)
+- **Competition Mathematics**: Solid 13.3% performance on AIME25 - competing with leading models on elite mathematical competitions
+- **Code Generation**: 3.4% improvement on MBPP shows better programming logic understanding
+- **Domain-Specific Knowledge**: Improvements in STEM (+1.6%), Humanities (+2.2%), and Social Sciences (+2.3%)
+#### ⚠️ **Trade-offs Observed**
+- **Instruction Following**: Slight decrease in IFEval scores (-5% prompt-level, -4.9% instruction-level)
+- **General Knowledge**: Overall MMLU score decreased by 3.3% due to reasoning specialization
+- **Reasoning Focus**: Model optimized for deep analytical thinking over rapid instruction compliance
+#### 🎯 **Specialized Capabilities**
+- **Mathematical Excellence**: Outstanding 76.8% accuracy on MATH benchmark - among the top performances for 27B models
+- **Scientific Reasoning**: Exceptional 45.96% on GPQA Diamond - handling graduate-level physics, chemistry, and biology problems
+- **Elite Competition Performance**: Competitive 13.3% on AIME25 - tackling American Invitational Mathematics Exam challenges
+- **Chain-of-Thought Mastery**: Demonstrates sophisticated reasoning through detailed thinking processes with multi-method verification
+- **Transparent Reasoning**: Shows complete work and self-validates answers using multiple approaches (as shown in CoT examples)
+- **Cross-Domain Expertise**: Superior performance spanning mathematics, natural sciences, and logical reasoning
+### Benchmarking Methodology
+Our evaluation follows rigorous benchmarking principles:
+1. **Reproducible Environment**: All tests conducted with fixed random seeds and controlled temperature settings
+2. **Diverse Metrics**: Beyond accuracy, we evaluate reasoning quality, step-by-step explanations, and cross-domain scientific performance
+3. **Research-Relevant Tasks**: Focus on real-world applications in education, scientific research, and advanced technical analysis
+4. **Comparative Baselines**: Direct comparison with original Gemma-3-27B-IT and established benchmarks
+### Performance Analysis
+According to [(Domino AI's benchmarking guidelines)](https://domino.ai/blog/benchmarking-predictive-models), we evaluated both predictive characteristics and operational constraints:
+- **Mathematical & Scientific Excellence**: 76.8% MATH accuracy and 45.96% GPQA Diamond represent breakthrough reasoning capabilities
+- **Competition-Level Performance**: 13.3% AIME25 accuracy demonstrates capability in elite mathematical competitions
+- **Industry Recognition**: Based on [Google's Gemma 3 announcement](https://www.ainewshub.org/post/google-unveils-gemma-3-a-game-changer-in-open-source-ai), the 27B model achieves 1338 Elo on Chatbot Arena
+- **Advanced Problem Solving**: GPQA Diamond performance significantly exceeds typical model benchmarks (30-35% baseline)
+- **Latency**: Average inference time increased by ~15% due to enhanced reasoning processes - worthwhile trade-off for quality
+- **Quality**: Exceptional improvements in explanation quality - mathematical (+26.8%) and scientific reasoning (+11-16%)
+- **Reliability**: Consistent performance across multiple evaluation runs with detailed step-by-step reasoning chains
+- **Cross-Domain Specialization**: Superior performance in mathematics, natural sciences, and complex logical reasoning
+## Usage
+### Installation
+For multimodal functionality, ensure you have the latest versions of the required packages:
+```bash
+pip install -U transformers torch torchvision
+pip install -U pillow requests
+# For GPU acceleration
+pip install -U accelerate
+```
+### Basic Text Usage
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+import torch
+# Load model and tokenizer
+model_name = "RekklesAI/LogicFlow-gemma-3-27b-thinking"
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModelForCausalLM.from_pretrained(
+    model_name,
+    torch_dtype=torch.bfloat16,
+    device_map="auto"
+)
+# Example usage for reasoning tasks
+prompt = """Solve this step by step:
+If a train travels 120 km in 2 hours, and then 180 km in the next 3 hours, what is its average speed for the entire journey?
+Let me think through this step by step:"""
+inputs = tokenizer(prompt, return_tensors="pt")
+with torch.no_grad():
+    outputs = model.generate(
+        **inputs,
+        max_new_tokens=512,
+        do_sample=True,
+        top_p=0.95,
+        top_k=64,
+        temperature=0.7
+    )
+response = tokenizer.decode(outputs[0], skip_special_tokens=True)
+print(response)
+```
+### Multimodal Usage (Text + Image)
+```python
+from transformers import AutoProcessor, Gemma3ForConditionalGeneration
+from PIL import Image
+import requests
+import torch
+# Load model and processor
+model_name = "RekklesAI/LogicFlow-gemma-3-27b-thinking"
+model = Gemma3ForConditionalGeneration.from_pretrained(
+    model_name,
+    torch_dtype=torch.bfloat16,
+    device_map="auto"
+)
+processor = AutoProcessor.from_pretrained(model_name)
+# Load an image (example: a mathematical diagram or chart)
+url = "https://example.com/math-diagram.jpg"
+image = Image.open(requests.get(url, stream=True).raw)
+# Create a multimodal prompt for step-by-step analysis
+prompt = """<start_of_image>Analyze this mathematical diagram step by step.
+What mathematical concepts are being illustrated, and how would you solve any problems shown?
+Please provide a detailed, step-by-step explanation."""
+# Process the inputs
+model_inputs = processor(text=prompt, images=image, return_tensors="pt")
+# Generate response
+input_len = model_inputs["input_ids"].shape[-1]
+with torch.inference_mode():
+    generation = model.generate(
+        **model_inputs,
+        max_new_tokens=1024,
+        do_sample=True,
+        top_p=0.95,
+        temperature=0.7
+    )
+    generation = generation[0][input_len:]
+# Decode the response
+response = processor.decode(generation, skip_special_tokens=True)
+print(response)
+```
+### Chat Template Usage
+This model uses the standard Gemma 3 multimodal chat template with optimized formatting:
+#### Text-only Chat
+```python
+# Assuming model_name = "RekklesAI/LogicFlow-gemma-3-27b-thinking" is already defined
+messages = [
+    {"role": "system", "content": "You are a helpful AI assistant specialized in logical reasoning and mathematics."},
+    {"role": "user", "content": "Explain the reasoning behind the Pythagorean theorem and provide a step-by-step proof."}
+]
+input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+inputs = tokenizer(input_text, return_tensors="pt")
+outputs = model.generate(
+    **inputs,
+    max_new_tokens=1024,
+    do_sample=True,
+    top_p=0.95,
+    temperature=0.7
+)
+response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
+print(response)
+```
+#### Multimodal Chat (with Images)
+```python
+from PIL import Image
+# Assuming model and processor for "RekklesAI/LogicFlow-gemma-3-27b-thinking" are already loaded
+# Load an image
+image = Image.open("path/to/your/image.jpg")
+messages = [
+    {
+        "role": "user",
+        "content": "Analyze this chart and explain the trends you observe. What mathematical relationships can you identify?",
+        "images": [image]  # Include image in the message
+    }
+]
+# Use processor for multimodal inputs
+model_inputs = processor.apply_chat_template(
+    messages,
+    add_generation_prompt=True,
+    return_tensors="pt"
+)
+outputs = model.generate(
+    **model_inputs,
+    max_new_tokens=1024,
+    do_sample=True,
+    top_p=0.95,
+    temperature=0.7
+)
+response = processor.decode(outputs[0], skip_special_tokens=True)
+print(response)
+```
+#### Chat Template Format
+The model uses the following multimodal template format:
+```
+{{- bos_token }}
+{%- for message in messages %}
+    {%- if message['role'] == 'system' %}
+        {{- '<start_of_turn>system\n' + message['content'] + '<end_of_turn>\n' }}
+    {%- elif message['role'] == 'user' %}
+        {{- '<start_of_turn>user\n' }}
+        {%- if 'images' in message and message['images'] %}
+            {%- for image in message['images'] %}
+                {{- '<start_of_image>\n<end_of_image>\n' }}
+            {%- endfor %}
+        {%- endif %}
+        {{- message['content'] + '<end_of_turn>\n' }}
+    {%- elif message['role'] == 'assistant' %}
+        {{- '<start_of_turn>model\n' + message['content'] + '<end_of_turn>\n' }}
+    {%- endif %}
+{%- endfor %}
+{%- if add_generation_prompt and messages[-1]['role'] != 'assistant' %}
+    {{- '<start_of_turn>model\n' }}
+{%- endif %}
+```
+### Step-by-Step Reasoning Examples
+LogicFlow-gemma-3-27b-thinking demonstrates exceptional reasoning capabilities through detailed Chain-of-Thought (CoT) processes. Below are real examples showcasing the model's thinking methodology:
+#### Example 1: Mathematical Comparison
+**Question**: "9.11 and 9.9, which one is larger?"
+![CoT Example 1](CoT_example_2.png)
+The model demonstrates sophisticated numerical reasoning by:
+- Converting decimals to fractional comparisons (11/100 vs 90/100)
+- Using multiple verification methods (number line visualization, real-world applications)
+- Calculating the precise difference (0.79) to confirm the result
+- Providing comprehensive step-by-step analysis
+#### Example 2: Letter Counting Task
+**Question**: "How many r's are in the word strawberry?"
+![CoT Example 2](CoT_example_1.png)
+The model showcases systematic thinking through:
+- Letter-by-letter breakdown of the word "strawberry"
+- Multiple verification approaches (position counting, pattern grouping)
+- Cross-checking results using different methodologies
+- Clear documentation of the reasoning process
+These examples demonstrate the model's ability to:
+- **🔍 Break down complex problems** into manageable steps
+- **✅ Self-verify results** using multiple approaches
+- **📝 Document reasoning chains** for transparency
+- **🎯 Maintain accuracy** while showing work
+### Activating Chain-of-Thought Reasoning
+To get the best reasoning performance from LogicFlow-gemma-3-27b-thinking, use prompts that encourage step-by-step thinking:
+```python
+# Example prompt for mathematical reasoning
+prompt = """Please solve this problem step by step, showing your thinking process:
+Question: Compare 9.11 and 9.9. Which number is larger?
+Think through this carefully and show your work."""
+# Example prompt for logical reasoning
+prompt = """Let me work through this systematically:
+Question: How many times does the letter 'r' appear in the word 'strawberry'?
+Please show your step-by-step analysis."""
+# For complex problems, you can explicitly request thinking
+prompt = """Think step by step about this problem:
+[Your complex question here]
+Show your reasoning process before giving the final answer."""
+```
+**Pro Tips for Best Results:**
+- Use phrases like "step by step", "think through this", "show your work"
+- For math problems, request multiple verification methods
+- Ask for reasoning before the final answer
+- Use temperature settings around 0.7 for optimal reasoning creativity
+## Intended Use Cases
+This multimodal model is particularly well-suited for:
+### 📚 Educational Applications
+- **Chain-of-Thought Tutoring**: Demonstrates complete problem-solving processes with transparent reasoning steps
+- **Mathematical Education**: Shows multiple verification methods for mathematical concepts (as seen in 9.11 vs 9.9 example)
+- **Critical Thinking Development**: Models systematic analysis and self-verification techniques
+- **Visual Learning**: Analyzing educational diagrams, charts, and mathematical illustrations
+- **Interactive Learning**: Combining text and visual elements for comprehensive understanding
+### 🔢 Mathematical & Scientific Analysis
+- **Chart Analysis**: Interpreting graphs, statistical charts, and data visualizations
+- **Geometric Problem Solving**: Analyzing geometric figures and spatial relationships
+- **Scientific Diagram Understanding**: Processing scientific illustrations and technical drawings
+- **Formula Recognition**: Understanding mathematical formulas in images
+### 💼 Professional Applications
+- **Document Analysis**: Processing documents containing both text and visual elements
+- **Technical Documentation**: Understanding technical manuals with diagrams
+- **Data Visualization**: Analyzing and explaining complex charts and infographics
+- **Research Assistance**: Combining textual research with visual data analysis
+### 🧠 Advanced Reasoning Tasks
+- **Chain-of-Thought Problem Solving**: Complex reasoning with detailed step-by-step analysis and self-verification
+- **Multi-Method Validation**: Using multiple approaches to verify answers (numerical comparison, pattern analysis, etc.)
+- **Transparent Decision Making**: Showing complete reasoning chains for critical analysis tasks
+- **Multimodal Problem Solving**: Tackling problems that require both visual and textual understanding
+- **Visual Code Analysis**: Understanding flowcharts, UML diagrams, and code structure visualizations
+- **Pattern Recognition**: Identifying patterns in both visual and textual data
+## Limitations
+### Text Generation
+- The model may occasionally generate incorrect mathematical calculations despite showing proper reasoning steps
+- Performance on highly specialized domain knowledge outside of mathematics and logic may be limited
+- As with all language models, it can sometimes produce hallucinated information
+### Vision Understanding
+- **Image Resolution**: Images are resized to 896x896 pixels, which may lose important details in high-resolution images
+- **Image Quality**: Poor quality, blurry, or low-contrast images may reduce accuracy
+- **Complex Visual Elements**: Very dense charts or diagrams with small text may be challenging to interpret
+- **Image Formats**: Only supports standard image formats (JPEG, PNG, WebP)
+### General Limitations
+- The model should not be used for critical decision-making without human verification
+- Multimodal reasoning combining complex visual and textual elements may sometimes produce inconsistent results
+- Processing images increases computational requirements and inference time
+## Ethical Considerations
+- This model should be used responsibly and outputs should be verified, especially for important decisions
+- The model may reflect biases present in its training data
+- Users should be aware that the model's reasoning, while often sound, is not infallible
+## Complete Training Configuration
+For full reproducibility, here is the complete training configuration used:
+```yaml
+bf16: true
+cutoff_len: 2048
+dataset: openo1_sft,open_thoughts,open_r1_math
+dataset_dir: data
+ddp_timeout: 180000000
+do_train: true
+enable_thinking: true
+finetuning_type: lora
+flash_attn: auto
+freeze_multi_modal_projector: true
+freeze_vision_tower: true
+gradient_accumulation_steps: 8
+image_max_pixels: 589824
+image_min_pixels: 1024
+include_num_input_tokens_seen: true
+learning_rate: 5.0e-05
+logging_steps: 5
+lora_alpha: 16
+lora_dropout: 0
+lora_rank: 8
+lora_target: all
+lr_scheduler_type: cosine
+max_grad_norm: 1.0
+max_samples: 100000
+model_name_or_path: google/gemma-3-27b-it
+num_train_epochs: 5.0
+optim: adamw_torch
+output_dir: saves/Gemma-3-27B-Instruct/lora/train_2025-06-12-17-10-14
+packing: false
+per_device_train_batch_size: 2
+plot_loss: true
+preprocessing_num_workers: 16
+report_to: none
+save_steps: 100
+stage: sft
+template: gemma
+trust_remote_code: true
+video_max_pixels: 65536
+video_min_pixels: 256
+warmup_steps: 100
+```
+## Technical Specifications
+### Core Framework
+- **Framework**: Transformers 4.52.4
+- **PEFT Version**: 0.15.2
+- **PyTorch Version**: 2.7.0+cu126
+- **Training Framework**: LLaMA-Factory with LoRA fine-tuning
+### Hardware Requirements
+- **Recommended GPU Memory**: 32GB+ VRAM for multimodal inference
+- **Minimum GPU Memory**: 24GB VRAM (text-only mode)
+- **CPU Memory**: 64GB+ RAM recommended for optimal performance
+- **Quantization**: Supports 4-bit and 8-bit quantization for reduced memory usage
+### Vision Specifications
+- **Vision Model**: SIGLIP-based vision encoder
+- **Image Resolution**: 896x896 pixels (normalized)
+- **Image Patch Size**: 14x14 pixels
+- **Vision Hidden Size**: 1,152
+- **Vision Layers**: 27 layers
+- **Tokens per Image**: 256 tokens
+- **Supported Image Formats**: JPEG, PNG, WebP
+### Architecture Details
+- **Model Architecture**: Gemma3ForConditionalGeneration
+- **Text Hidden Size**: 5,376
+- **Vision Hidden Size**: 1,152
+- **Attention Heads**: 32 (text), 16 (vision)
+- **Hidden Layers**: 62 (text), 27 (vision)
+- **Context Window**: 131,072 tokens (including image tokens)
+## Citation
+If you use this model in your research or applications, please cite:
+```bibtex
+@model{logicflow-gemma-3-27b-thinking,
+  title={LogicFlow-gemma-3-27b-thinking: A Fine-tuned Model for Enhanced Reasoning},
+  author={[Xiangda Li]},
+  year={2025},
+  base_model={google/gemma-3-27b-it},
+  url={https://huggingface.co/RekklesAI/LogicFlow-gemma-3-27b-thinking}
+}
+```
+## Acknowledgments
+- Based on Google's Gemma-3-27B-IT model
+- Fine-tuned using LLaMA-Factory framework
+- Training data from open-source reasoning and mathematics datasets
+---
+*This model card was generated to provide comprehensive information about the LogicFlow-gemma-3-27b-thinking model. Please refer to the original Gemma-3 model documentation for additional technical details about the base architecture.*

added_tokens.json ADDED Viewed

	@@ -0,0 +1,3 @@

+{
+  "<image_soft_token>": 262144
+}

chat_template.jinja ADDED Viewed

	@@ -0,0 +1,19 @@

+{{- bos_token }}
+{%- for message in messages %}
+    {%- if message['role'] == 'system' %}
+        {{- '<start_of_turn>system\n' + message['content'] + '<end_of_turn>\n' }}
+    {%- elif message['role'] == 'user' %}
+        {{- '<start_of_turn>user\n' }}
+        {%- if 'images' in message and message['images'] %}
+            {%- for image in message['images'] %}
+                {{- '<start_of_image>\n<end_of_image>\n' }}
+            {%- endfor %}
+        {%- endif %}
+        {{- message['content'] + '<end_of_turn>\n' }}
+    {%- elif message['role'] == 'assistant' %}
+        {{- '<start_of_turn>model\n' + message['content'] + '<end_of_turn>\n' }}
+    {%- endif %}
+{%- endfor %}
+{%- if add_generation_prompt and messages[-1]['role'] != 'assistant' %}
+    {{- '<start_of_turn>model\n' }}
+{%- endif %}

config.json ADDED Viewed

	@@ -0,0 +1,64 @@

+{
+  "architectures": [
+    "Gemma3ForConditionalGeneration"
+  ],
+  "boi_token_index": 255999,
+  "eoi_token_index": 256000,
+  "eos_token_id": [
+    1,
+    106
+  ],
+  "hidden_size": 5376,
+  "image_token_index": 262144,
+  "initializer_range": 0.02,
+  "mm_tokens_per_image": 256,
+  "model_type": "gemma3",
+  "text_config": {
+    "attention_bias": false,
+    "attention_dropout": 0.0,
+    "attn_logit_softcapping": null,
+    "cache_implementation": "hybrid",
+    "final_logit_softcapping": null,
+    "head_dim": 128,
+    "hidden_activation": "gelu_pytorch_tanh",
+    "hidden_size": 5376,
+    "initializer_range": 0.02,
+    "intermediate_size": 21504,
+    "max_position_embeddings": 131072,
+    "model_type": "gemma3_text",
+    "num_attention_heads": 32,
+    "num_hidden_layers": 62,
+    "num_key_value_heads": 16,
+    "query_pre_attn_scalar": 168,
+    "rms_norm_eps": 1e-06,
+    "rope_local_base_freq": 10000.0,
+    "rope_scaling": {
+      "factor": 8.0,
+      "rope_type": "linear"
+    },
+    "rope_theta": 1000000.0,
+    "sliding_window": 1024,
+    "sliding_window_pattern": 6,
+    "torch_dtype": "bfloat16",
+    "use_cache": true,
+    "vocab_size": 262208
+  },
+  "torch_dtype": "bfloat16",
+  "transformers_version": "4.52.4",
+  "use_cache": true,
+  "vision_config": {
+    "attention_dropout": 0.0,
+    "hidden_act": "gelu_pytorch_tanh",
+    "hidden_size": 1152,
+    "image_size": 896,
+    "intermediate_size": 4304,
+    "layer_norm_eps": 1e-06,
+    "model_type": "siglip_vision_model",
+    "num_attention_heads": 16,
+    "num_channels": 3,
+    "num_hidden_layers": 27,
+    "patch_size": 14,
+    "torch_dtype": "bfloat16",
+    "vision_use_head": false
+  }
+}

generation_config.json ADDED Viewed

	@@ -0,0 +1,13 @@

+{
+  "bos_token_id": 2,
+  "cache_implementation": "hybrid",
+  "do_sample": true,
+  "eos_token_id": [
+    1,
+    106
+  ],
+  "pad_token_id": 0,
+  "top_k": 64,
+  "top_p": 0.95,
+  "transformers_version": "4.52.4"
+}

model-00001-of-00012.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:a6e53676b23f62c7a940415b59cd7d9757568f6c546548254a0e5230c4f0ae13
+size 4854573696

model-00002-of-00012.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:d0bf42ce0d9c4867b3e6d48584efc9f1850a04ab32118a339835c9dc02e99eb0
+size 4999384608

model-00003-of-00012.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:931d48505a99ff8c724dab4559c618fc21218fdda134f9096123e76741720ae8
+size 4976813240

model-00004-of-00012.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:754318d971a5f631178d0c6148cb9868ad1959923a3becdb25aadcb47a806753
+size 4998834104

model-00005-of-00012.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:53fb0b3f330d6633182527266b223b8584760dbdbc29d006dbd4ecace8afd401
+size 4954792984

model-00006-of-00012.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:369f3e9bd8167ac6ee9703a106e54d610452f12fc82fd641fdbc965fa0967220
+size 4954792976

model-00007-of-00012.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:767fdd32c80f1cef4465891b5ddd24be1ef0d300e99ca8883b766b832fc1f15a
+size 4822682000

model-00008-of-00012.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:a36e8db8b163a468166bffc11628d96c22d51debbdef2160a4db55c74dfbc6de
+size 4954793016

model-00009-of-00012.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c3dba55e894f76a0a1df454c795bfbf366989c2aafae28409bf27316e53e15e7
+size 4954792992

model-00010-of-00012.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:7b310cd9668637784789814591f63188f75ddde6f12c3396b76f3d1c175d4485
+size 4954793000

model-00011-of-00012.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:a09f8202b3e0474cdb997d418578a376ecd548ab4a69adb1b98d42d587675407
+size 4954793016

model-00012-of-00012.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:0f0fd0eec5794b7a2130be1521ebd2a58695ad1a3f20cadfea330e86e96d2421
+size 3303195352

model.safetensors.index.json ADDED Viewed

The diff for this file is too large to render. See raw diff

preprocessor_config.json ADDED Viewed

	@@ -0,0 +1,29 @@

+{
+  "do_convert_rgb": null,
+  "do_normalize": true,
+  "do_pan_and_scan": null,
+  "do_rescale": true,
+  "do_resize": true,
+  "image_mean": [
+    0.5,
+    0.5,
+    0.5
+  ],
+  "image_processor_type": "Gemma3ImageProcessor",
+  "image_seq_length": 256,
+  "image_std": [
+    0.5,
+    0.5,
+    0.5
+  ],
+  "pan_and_scan_max_num_crops": null,
+  "pan_and_scan_min_crop_size": null,
+  "pan_and_scan_min_ratio_to_activate": null,
+  "processor_class": "Gemma3Processor",
+  "resample": 2,
+  "rescale_factor": 0.00392156862745098,
+  "size": {
+    "height": 896,
+    "width": 896
+  }
+}

special_tokens_map.json ADDED Viewed

	@@ -0,0 +1,33 @@

+{
+  "boi_token": "<start_of_image>",
+  "bos_token": {
+    "content": "<bos>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eoi_token": "<end_of_image>",
+  "eos_token": {
+    "content": "<end_of_turn>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "image_token": "<image_soft_token>",
+  "pad_token": {
+    "content": "<pad>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "unk_token": {
+    "content": "<unk>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}

tokenizer.json ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:4667f2089529e8e7657cfb6d1c19910ae71ff5f28aa7ab2ff2763330affad795
+size 33384568

tokenizer_config.json ADDED Viewed

The diff for this file is too large to render. See raw diff

training_loss.png ADDED Viewed