RA_Reasoner 2.0

Model Details

Developed by: Daemontatox
License: Apache 2.0
Base Model: Daemontatox/RA_Reasoner

This model is fine-tuned from the Falcon-10B-Instruct model, leveraging advanced training optimizations to enhance reasoning and instruction-following capabilities. It was trained 2x faster using Unsloth and Hugging Face's TRL library.

Training Details

Frameworks Used: Unsloth, Hugging Face TRL
Fine-Tuning Focus: Emphasis on reasoning, logic-based tasks, and instruction comprehension.
Dataset: Includes examples from Daemontatox/Deepthinking-COT.
Optimization: Significant speedup during fine-tuning while maintaining model quality.

Further details on hyperparameters and fine-tuning methodology will be added in future updates.

Intended Use

This model is intended for research and development in text generation, reasoning tasks, and instruction-following applications.

Key Features:

Enhanced reasoning capabilities for multi-step logical problems.
Robust instruction-following for complex tasks.
Fine-tuned for Chain-of-Thought (COT) reasoning and inference.

Applications:

Research on reasoning-based AI systems.
Tasks requiring logical deductions, such as question answering and problem-solving.
General text generation with a focus on nuanced understanding.

Limitations and Warnings

This model is not designed for real-time or production-critical tasks.
Outputs may vary based on input specificity and complexity.
Users are responsible for ensuring ethical use and compliance with applicable regulations.

Acknowledgments

Base model: Daemontatox/RA_Reasoner
Training acceleration powered by Unsloth and Hugging Face's TRL library.
Dataset contributions: Daemontatox/Deepthinking-COT.

---# Open LLM Leaderboard Evaluation Results Detailed results can be found here! Summarized results can be found here!

Metric	Value (%)
Average	29.00
IFEval (0-Shot)	53.66
BBH (3-Shot)	43.07
MATH Lvl 5 (4-Shot)	22.89
GPQA (0-shot)	9.96
MuSR (0-shot)	7.18
MMLU-PRO (5-shot)	37.26

Daemontatox
/

RA_Reasoner2.0

RA_Reasoner 2.0

Model Details

Training Details

Intended Use

Key Features:

Applications:

Limitations and Warnings

Acknowledgments

Model tree for Daemontatox/RA_Reasoner2.0

Dataset used to train Daemontatox/RA_Reasoner2.0

Collection including Daemontatox/RA_Reasoner2.0

Reason/COT

Evaluation results