license: apache-2.0
base_model: Qwen/Qwen2.5-Coder-3B
tags:
  - code
  - humaneval
  - multi-agent
  - mlgrpo
  - qwen2.5
library_name: transformers
pipeline_tag: text-generation

2xQwen2.5-Coder-3B-Satyr-Aux

This model is a fine-tuned version of Qwen/Qwen2.5-Coder-3B, trained with Multi-LLM Group Relative Policy Optimization (MAGRPO) on the HumanEval dataset.
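
Since the card lists `transformers` as the library and `text-generation` as the pipeline, loading should follow the usual causal-LM pattern. The snippet below is a minimal sketch, not an official usage recipe. `build_prompt` and `complete` are hypothetical helper names, the prompt layout is only an assumed HumanEval-style format, and `BASE_MODEL_ID` points at the base model named in the metadata because this checkpoint's repository ID is not stated here. Substitute the fine-tuned repository ID to run this model.

```python
# Base model from the card's metadata. Replace with this checkpoint's
# repository ID (not stated in this card) to load the fine-tuned weights.
BASE_MODEL_ID = "Qwen/Qwen2.5-Coder-3B"


def build_prompt(signature: str, docstring: str) -> str:
    """Format an assumed HumanEval-style prompt: a signature plus its docstring."""
    return f'{signature}\n    """{docstring}"""\n'


def complete(prompt: str, model_id: str = BASE_MODEL_ID, max_new_tokens: int = 128) -> str:
    """Greedily generate a continuation of `prompt` with a causal LM."""
    # Imported lazily so build_prompt works without transformers/torch installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Drop the prompt tokens and decode only the newly generated ones.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    prompt = build_prompt("def add(a: int, b: int) -> int:", "Return the sum of a and b.")
    print(prompt + complete(prompt))
```

Greedy decoding (`do_sample=False`) is chosen here only to make the output repeatable. The card does not specify recommended generation settings.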