---
license: apache-2.0
datasets:
- nicholasKluge/reward-aira-dataset
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- text-generation-inference
---
# Aira-2-124M-DPO-checkpoint-200
## Hyperparameters
```yaml
model_args:
  base_model: "nicholasKluge/Aira-2-124M"
  model_ref: "nicholasKluge/Aira-2-124M"
  cache_dir: null
data_args:
  dataset_name: "nicholasKluge/reward-aira-dataset"
  dataset_split: "english"
  validation_split_percentage: null
  streaming: false
  max_prompt_length: 150
  max_length: 600
  sanity_check: false
training_args:
  output_dir: "checkpoints"
  do_eval: false
  evaluation_strategy: "no"
  save_strategy: "steps"
  logging_strategy: "steps"
  logging_steps: 200
  max_steps: 2400
  save_steps: 200
  per_device_train_batch_size: 8
  per_device_eval_batch_size: 8
  gradient_accumulation_steps: 1
  gradient_checkpointing: false
  optim: "adamw_torch"
  learning_rate: 0.00005
  lr_scheduler_type: "cosine"
  warmup_steps: 100
  hub_token: null
  push_to_hub: false
  hub_model_id: null
extra_args:
  project_name: "Aira-2"
  wandb_token: null
  beta: 0.8
```
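
The `learning_rate`, `lr_scheduler_type`, `warmup_steps`, and `max_steps` values above imply a learning-rate curve. The sketch below assumes the common linear-warmup plus half-cosine decay formula; the helper `lr_at_step` is hypothetical and is not taken from the training script. Under that assumption, step 200 evaluates to roughly 4.98e-05, in line with the `learning_rate` reported in the logs for this checkpoint.

```python
import math


def lr_at_step(step, base_lr=5e-5, warmup_steps=100, max_steps=2400):
    """Learning rate at a given optimizer step (assumed schedule, see lead-in)."""
    if step < warmup_steps:
        # Linear warmup from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup training that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, max_steps - warmup_steps)
    # Half-cosine decay from base_lr down to 0.
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Steps at which checkpoints are saved (save_steps: 200).
    for step in range(200, 2401, 200):
        print(step, lr_at_step(step))
```
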
## Logs
| Key | Value |
|-----------------------|---------------------------------|
| loss | 0.2274 |
| learning_rate | 4.976714865090827e-05 |
| rewards/chosen | -33.849693298339844 |
| rewards/rejected | -114.72045135498047 |
| rewards/accuracies | 0.9768750071525574 |
| rewards/margins | 80.87075805664062 |
| logps/rejected | -404.8834228515625 |
| logps/chosen | -383.7469482421875 |
| logits/rejected | -67.6454086303711 |
| logits/chosen | -30.543472290039062 |
| epoch | 0.05 |
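
The logged values can be cross-checked against each other. The sketch below assumes the usual DPO convention in which the implicit reward is `beta * (log p_policy - log p_ref)` and `rewards/margins` is the chosen reward minus the rejected reward; the variable names are illustrative. Dividing a reward by `beta` then gives the implied shift in log-probability relative to the reference model.

```python
# Values copied from the log table and the hyperparameters above.
beta = 0.8
reward_chosen = -33.849693298339844
reward_rejected = -114.72045135498047
logged_margin = 80.87075805664062

# Margin as the difference of the two implicit rewards.
margin = reward_chosen - reward_rejected
print(f"recomputed margin: {margin:.5f} (logged: {logged_margin:.5f})")

# Implied log-probability shift vs. the reference model (assumed convention).
shift_chosen = reward_chosen / beta
shift_rejected = reward_rejected / beta
print(f"log-ratio, chosen:   {shift_chosen:.3f}")
print(f"log-ratio, rejected: {shift_rejected:.3f}")
```

Both rewards are negative, so under this convention the policy assigns lower log-probability than the reference to chosen and rejected completions alike; the margin is positive because the rejected completions are pushed down much further.
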
## Eval
| Task |Version| Metric |Value | |Stderr|
|-------------|------:|--------|-----:|---|-----:|
|arc_challenge| 0|acc |0.2031|± |0.0118|
| | |acc_norm|0.2491|± |0.0126|
|toxigen | 0|acc |0.5521|± |0.0162|
| | |acc_norm|0.4340|± |0.0162|
|truthfulqa_mc| 1|mc1 |0.2485|± |0.0151|
| | |mc2 |0.4368|± |0.0153|
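
To read the `Stderr` column, one option is to turn each value into an approximate 95% interval. This is a minimal sketch assuming a normal approximation (value ± 1.96 × stderr); the helper `ci95` is hypothetical and not part of the evaluation harness.

```python
def ci95(value, stderr, z=1.96):
    """Approximate 95% confidence interval under a normal approximation."""
    half_width = z * stderr
    return value - half_width, value + half_width


# (value, stderr) pairs copied from the table above.
results = {
    "arc_challenge/acc": (0.2031, 0.0118),
    "arc_challenge/acc_norm": (0.2491, 0.0126),
    "toxigen/acc": (0.5521, 0.0162),
    "toxigen/acc_norm": (0.4340, 0.0162),
    "truthfulqa_mc/mc1": (0.2485, 0.0151),
    "truthfulqa_mc/mc2": (0.4368, 0.0153),
}

for name, (value, stderr) in results.items():
    low, high = ci95(value, stderr)
    print(f"{name}: {value:.4f} [{low:.4f}, {high:.4f}]")
```
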