Kimory-X committed
Commit 7c8f9af · verified · 1 Parent(s): 36155bb

Model save

Files changed (4):
1. README.md +91 -0
2. all_results.json +9 -0
3. train_results.json +9 -0
4. trainer_state.json +0 -0
README.md ADDED
@@ -0,0 +1,91 @@
---
license: apache-2.0
base_model: mistralai/Mistral-7B-v0.1
tags:
- trl
- orpo
- generated_from_trainer
library_name: peft
model-index:
- name: zephyr-7b-orpo-qlora-lr5e6-beta0.1
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/1014579852qq-tsinghua-university/huggingface/runs/97o9oz6g)
# zephyr-7b-orpo-qlora-lr5e6-beta0.1

This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on an unknown dataset.
It achieves the following results on the evaluation set (the ORPO objective behind these metrics is sketched after the list):
- Loss: 1.0573
- Rewards/chosen: -0.0746
- Rewards/rejected: -0.0911
- Rewards/accuracies: 0.6168
- Rewards/margins: 0.0165
- Logps/rejected: -0.9108
- Logps/chosen: -0.7456
- Logits/rejected: -2.2160
- Logits/chosen: -2.3146
- Nll Loss: 1.0059
- Log Odds Ratio: -0.6434
- Log Odds Chosen: 0.2827

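As a reading aid for these metrics, this is the per-example ORPO objective from Hong et al. (2024): a supervised NLL term on the chosen response plus a β-weighted odds-ratio penalty. β corresponds to the 0.1 in the run name; TRL's exact batch reduction may differ, so the logged averages need not combine exactly.

```latex
% ORPO objective (Hong et al., 2024). "Nll Loss" and "Log Odds Ratio" in the
% card correspond to the two terms below; beta = 0.1 is inferred from the run name.
\mathcal{L}_{\mathrm{ORPO}}
  = \mathcal{L}_{\mathrm{NLL}}(y_w \mid x)
  \;-\; \beta \,\log \sigma\!\left(
      \log \frac{\mathrm{odds}_\theta(y_w \mid x)}{\mathrm{odds}_\theta(y_l \mid x)}
    \right),
\qquad
\mathrm{odds}_\theta(y \mid x) = \frac{P_\theta(y \mid x)}{1 - P_\theta(y \mid x)}
```

Here $y_w$ and $y_l$ are the chosen and rejected responses. The Rewards metrics are the β-scaled mean log-probabilities, which the numbers above confirm: Rewards/chosen = 0.1 × Logps/chosen (−0.0746 = 0.1 × −0.7456), and likewise for the rejected side.
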
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a hedged sketch of the corresponding TRL setup follows the list):
- learning_rate: 5e-06
- train_batch_size: 2
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 5
- gradient_accumulation_steps: 4
- total_train_batch_size: 40
- total_eval_batch_size: 20
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1

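The card does not include the launch script, so the following is only a minimal, hypothetical reconstruction of how these hyperparameters map onto TRL's `ORPOConfig`/`ORPOTrainer` with a QLoRA PEFT setup. The dataset, the LoRA rank/alpha/dropout, and `beta=0.1` (inferred from the run name) are assumptions, not taken from the card.

```python
# Hypothetical sketch of the training setup; not the author's actual script.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import ORPOConfig, ORPOTrainer

model_id = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# QLoRA: load the base model in 4-bit and train low-rank adapters on top.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config)

# LoRA hyperparameters are placeholders; the card does not report them.
peft_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")

# Values below mirror the hyperparameter list; the per-device numbers multiply
# out to the reported total train batch of 40 (2 x 5 GPUs x 4 accumulation steps).
args = ORPOConfig(
    output_dir="zephyr-7b-orpo-qlora-lr5e6-beta0.1",
    beta=0.1,                          # assumed from the run name
    learning_rate=5e-6,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=1,
    seed=42,
)

# Placeholder dataset; ORPOTrainer expects prompt/chosen/rejected columns.
ds = load_dataset("HuggingFaceH4/ultrafeedback_binarized")
trainer = ORPOTrainer(
    model=model,
    args=args,
    train_dataset=ds["train_prefs"],
    eval_dataset=ds["test_prefs"],
    tokenizer=tokenizer,
    peft_config=peft_config,
)
trainer.train()
```
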
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen | Nll Loss | Log Odds Ratio | Log Odds Chosen |
|:-------------:|:------:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|:--------:|:--------------:|:---------------:|
| 1.2019 | 0.0723 | 100 | 1.1583 | -0.0797 | -0.0926 | 0.5842 | 0.0129 | -0.9264 | -0.7970 | -2.2900 | -2.3902 | 1.0948 | -0.6684 | 0.2079 |
| 1.1793 | 0.1446 | 200 | 1.1165 | -0.0775 | -0.0918 | 0.6060 | 0.0143 | -0.9181 | -0.7754 | -2.2619 | -2.3628 | 1.0550 | -0.6553 | 0.2332 |
| 1.1317 | 0.2170 | 300 | 1.0960 | -0.0758 | -0.0910 | 0.6060 | 0.0153 | -0.9104 | -0.7575 | -2.2513 | -2.3506 | 1.0367 | -0.6496 | 0.2630 |
| 1.1023 | 0.2893 | 400 | 1.0837 | -0.0758 | -0.0914 | 0.6087 | 0.0156 | -0.9140 | -0.7583 | -2.2417 | -2.3391 | 1.0275 | -0.6482 | 0.2625 |
| 1.1022 | 0.3616 | 500 | 1.0748 | -0.0751 | -0.0911 | 0.6168 | 0.0160 | -0.9110 | -0.7507 | -2.2335 | -2.3303 | 1.0204 | -0.6471 | 0.2753 |
| 1.1102 | 0.4339 | 600 | 1.0691 | -0.0746 | -0.0908 | 0.6114 | 0.0162 | -0.9078 | -0.7461 | -2.2155 | -2.3130 | 1.0153 | -0.6447 | 0.2813 |
| 1.0911 | 0.5062 | 700 | 1.0641 | -0.0746 | -0.0909 | 0.6114 | 0.0163 | -0.9089 | -0.7463 | -2.2246 | -2.3233 | 1.0116 | -0.6446 | 0.2801 |
| 1.0863 | 0.5786 | 800 | 1.0610 | -0.0745 | -0.0912 | 0.6168 | 0.0167 | -0.9119 | -0.7445 | -2.2159 | -2.3155 | 1.0091 | -0.6425 | 0.2909 |
| 1.099 | 0.6509 | 900 | 1.0589 | -0.0749 | -0.0914 | 0.6168 | 0.0165 | -0.9135 | -0.7485 | -2.2140 | -2.3129 | 1.0076 | -0.6436 | 0.2801 |
| 1.067 | 0.7232 | 1000 | 1.0580 | -0.0745 | -0.0907 | 0.6168 | 0.0162 | -0.9071 | -0.7447 | -2.2185 | -2.3171 | 1.0064 | -0.6446 | 0.2788 |
| 1.1264 | 0.7955 | 1100 | 1.0574 | -0.0748 | -0.0913 | 0.6141 | 0.0165 | -0.9129 | -0.7477 | -2.2184 | -2.3166 | 1.0061 | -0.6437 | 0.2809 |
| 1.0909 | 0.8678 | 1200 | 1.0573 | -0.0746 | -0.0911 | 0.6168 | 0.0165 | -0.9108 | -0.7458 | -2.2147 | -2.3135 | 1.0059 | -0.6437 | 0.2825 |
| 1.1175 | 0.9402 | 1300 | 1.0573 | -0.0746 | -0.0911 | 0.6168 | 0.0165 | -0.9108 | -0.7456 | -2.2160 | -2.3146 | 1.0059 | -0.6434 | 0.2827 |

### Framework versions

- PEFT 0.10.0
- Transformers 4.43.1
- Pytorch 2.1.2+cu121
- Datasets 2.18.0
- Tokenizers 0.19.1
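
Because this repo ships a PEFT adapter rather than full weights (`library_name: peft`), inference loads the Mistral base model and attaches the adapter. Below is a minimal sketch; the adapter repo id is an assumption based on the committer's username, so substitute the actual Hub path.

```python
# Minimal inference sketch for a PEFT adapter on top of Mistral-7B-v0.1.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "mistralai/Mistral-7B-v0.1"
adapter_id = "Kimory-X/zephyr-7b-orpo-qlora-lr5e6-beta0.1"  # assumed repo path

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(base, adapter_id)  # attach the trained adapter

inputs = tokenizer("Explain odds-ratio preference optimization:", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```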
all_results.json ADDED
@@ -0,0 +1,9 @@
{
  "epoch": 0.9994576026035075,
  "total_flos": 0.0,
  "train_loss": 1.1206706619124682,
  "train_runtime": 8808.4081,
  "train_samples": 55305,
  "train_samples_per_second": 6.279,
  "train_steps_per_second": 0.157
}
train_results.json ADDED
@@ -0,0 +1,9 @@
{
  "epoch": 0.9994576026035075,
  "total_flos": 0.0,
  "train_loss": 1.1206706619124682,
  "train_runtime": 8808.4081,
  "train_samples": 55305,
  "train_samples_per_second": 6.279,
  "train_steps_per_second": 0.157
}
trainer_state.json ADDED
The diff for this file is too large to render. See raw diff