Post
2466
This shared notebook comprises the MMLU benchmark evaluating task for my latest reasoning model for the sociology field. The results show that using Few-shot prompting in the system prompt can significantly improve the model's performance at answering questions.
Model's link:
alibidaran/GRPO_LLAMA3-instructive_reasoning1
Notebook evaluation:
https://www.kaggle.com/code/alibidaran/mmlu-socialogy-thinking-evals?scriptVersionId=277240033
Model's link:
alibidaran/GRPO_LLAMA3-instructive_reasoning1
Notebook evaluation:
https://www.kaggle.com/code/alibidaran/mmlu-socialogy-thinking-evals?scriptVersionId=277240033