RA_REASONER

RA_Reasoner 2.0

Model Details

Developed by: Daemontatox
License: Apache 2.0
Base Model: Daemontatox/RA_Reasoner

This model is fine-tuned from the Falcon-10B-Instruct model, leveraging advanced training optimizations to enhance reasoning and instruction-following capabilities. It was trained 2x faster using Unsloth and Hugging Face's TRL library.


Training Details

  • Frameworks Used: Unsloth, Hugging Face TRL
  • Fine-Tuning Focus: Emphasis on reasoning, logic-based tasks, and instruction comprehension.
  • Dataset: Includes examples from Daemontatox/Deepthinking-COT.
  • Optimization: Significant speedup during fine-tuning while maintaining model quality.

Further details on hyperparameters and fine-tuning methodology will be added in future updates.


Intended Use

This model is intended for research and development in text generation, reasoning tasks, and instruction-following applications.

Key Features:

  • Enhanced reasoning capabilities for multi-step logical problems.
  • Robust instruction-following for complex tasks.
  • Fine-tuned for Chain-of-Thought (COT) reasoning and inference.

Applications:

  • Research on reasoning-based AI systems.
  • Tasks requiring logical deductions, such as question answering and problem-solving.
  • General text generation with a focus on nuanced understanding.

Limitations and Warnings

  • This model is not designed for real-time or production-critical tasks.
  • Outputs may vary based on input specificity and complexity.
  • Users are responsible for ensuring ethical use and compliance with applicable regulations.

Acknowledgments

---# Open LLM Leaderboard Evaluation Results Detailed results can be found here! Summarized results can be found here!

Metric Value (%)
Average 29.00
IFEval (0-Shot) 53.66
BBH (3-Shot) 43.07
MATH Lvl 5 (4-Shot) 22.89
GPQA (0-shot) 9.96
MuSR (0-shot) 7.18
MMLU-PRO (5-shot) 37.26
Downloads last month
41
Safetensors
Model size
10.3B params
Tensor type
FP16
·
Inference Providers NEW
This model is not currently available via any of the supported third-party Inference Providers, and the model is not deployed on the HF Inference API.

Model tree for Daemontatox/RA_Reasoner2.0

Finetuned
(1)
this model
Quantizations
4 models

Dataset used to train Daemontatox/RA_Reasoner2.0

Collection including Daemontatox/RA_Reasoner2.0

Evaluation results