---
library_name: transformers
tags: []
---

## Model Description

This is the SFT model in our Mixture of Agents Alignment (MoAA) pipeline, fine-tuned from Gemma-2-9b-it. MoAA is an approach that leverages the collective intelligence of open-source LLMs to advance model alignment.

Our MoAA method involves two main stages. In the first stage, we employ Mixture-of-Agents (MoA) to produce high-quality synthetic data for supervised fine-tuning. In the second stage, we combine multiple LLMs into a reward model that provides preference annotations.

Some key takeaways of our work:

- 📈 **Alignment pipeline that actually works.** Our MoAA method lifts Llama-3.1-8B-Instruct's Arena-Hard score from **19 to 48** and Gemma-2-9B-it's from **42 to 56**, handily beating GPT-4o-labeled datasets at the time.
- 🏆 **Ensembled rewards > single critics.** An MoA reward model with dynamic criteria filtering edges out the competitive ArmoRM on MT-Bench and Arena-Hard, all while staying 100% open source.
- 🚀 **Self-improvement unlocked.** Fine-tuning the strongest model inside the ensemble on MoAA data lets it *surpass its own teachers*, evidence that open models can push past proprietary ceilings without external supervision.

## Model Sources

For more details, refer to:

- **[Paper](https://arxiv.org/abs/2505.03059)**

## How to Get Started with the Model

Use the code below to get started with the model. Run inference like this:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model weights from the Hugging Face Hub
# (device_map="auto" requires the `accelerate` package)
tokenizer = AutoTokenizer.from_pretrained("togethercomputer/gemma-2-9b-it-MoAA-SFT")
model = AutoModelForCausalLM.from_pretrained(
    "togethercomputer/gemma-2-9b-it-MoAA-SFT", device_map="auto", torch_dtype="auto"
)
```
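
Once the model and tokenizer are loaded, a minimal generation sketch could look like the following; the prompt and sampling settings here are illustrative choices, not recommendations from the paper:

```python
import torch

# Format a single-turn conversation with the model's built-in chat template
messages = [{"role": "user", "content": "Explain supervised fine-tuning in one paragraph."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate a response and decode only the newly produced tokens
with torch.no_grad():
    outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```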

## Training Details

### Training Data

The training data are available at https://huggingface.co/datasets/togethercomputer/MoAA-SFT.

We subsample from two widely used open-source instruction-tuning datasets: UltraFeedback and UltraChat. Our subsampling strategy uses the entire UltraFeedback dataset and randomly selects 5,000 samples from UltraChat.
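
To take a quick look at the released data, here is a minimal sketch using the `datasets` library (the `train` split name is an assumption; check the dataset card for the exact splits and fields):

```python
from datasets import load_dataset

# Load the MoAA SFT dataset from the Hugging Face Hub
ds = load_dataset("togethercomputer/MoAA-SFT", split="train")  # split name assumed
print(ds[0])  # inspect the fields of one example
```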

We use MoA to generate responses. The proposers used in our study are WizardLM-2-8x22b, Gemma-2-7b-it, Qwen-2-72b-Instruct, and Llama-3.1-70b-Instruct, while Qwen-1.5-110b-Instruct serves as the aggregator.
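
As an illustration of this two-step MoA flow (each proposer answers independently, then the aggregator synthesizes the candidates into one response), here is a schematic sketch; the client setup, model identifiers, and aggregation prompt are placeholders, not the exact configuration used in the paper:

```python
from openai import OpenAI

# Any OpenAI-compatible endpoint works; base_url and model names are placeholders,
# and the API key is assumed to be set in the environment.
client = OpenAI(base_url="https://api.together.xyz/v1")
PROPOSERS = ["wizardlm-2-8x22b", "gemma-2-7b-it", "qwen-2-72b-instruct", "llama-3.1-70b-instruct"]
AGGREGATOR = "qwen-1.5-110b-instruct"

def ask(model: str, prompt: str) -> str:
    resp = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content

def moa_response(instruction: str) -> str:
    # Step 1: each proposer answers the instruction independently.
    proposals = [ask(m, instruction) for m in PROPOSERS]
    # Step 2: the aggregator synthesizes the candidates into a single response.
    candidates = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(proposals))
    aggregate_prompt = (
        "Synthesize the candidate responses below into a single high-quality "
        f"response.\n\nInstruction: {instruction}\n\nCandidates:\n{candidates}"
    )
    return ask(AGGREGATOR, aggregate_prompt)
```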

## Evaluation & Performance

Refer to the [paper](https://arxiv.org/abs/2505.03059) for metrics.

## Citation

```bibtex
@article{wang2025improving,
  title   = {Improving Model Alignment Through Collective Intelligence of Open-Source LLMs},
  author  = {Junlin Wang and Roy Xie and Shang Zhu and Jue Wang and Ben Athiwaratkun and Bhuwan Dhingra and Shuaiwen Leon Song and Ce Zhang and James Zou},
  year    = {2025},
  journal = {arXiv preprint arXiv:2505.03059}
}
```