Model save


Files changed (7)

README.md +108 -0
all_results.json +9 -0
generation_config.json +7 -0
model.safetensors +1 -1
runs/Jun02_10-42-23_poseidon/events.out.tfevents.1717325399.poseidon.2371184.0 +2 -2
train_results.json +9 -0
trainer_state.json +0 -0

README.md ADDED

	@@ -0,0 +1,108 @@

+---
+license: apache-2.0
+base_model: martimfasantos/tinyllama-1.1b-chat-sft-full
+tags:
+- trl
+- dpo
+- generated_from_trainer
+model-index:
+- name: tinyllama-1.1b-chat-dpo-full
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# tinyllama-1.1b-chat-dpo-full
+This model is a fine-tuned version of [martimfasantos/tinyllama-1.1b-chat-sft-full](https://huggingface.co/martimfasantos/tinyllama-1.1b-chat-sft-full) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.5862
+- Rewards/chosen: -1.1608
+- Rewards/rejected: -1.6138
+- Rewards/accuracies: 0.6885
+- Rewards/margins: 0.4530
+- Logps/rejected: -458.4823
+- Logps/chosen: -452.2973
+- Logits/rejected: -2.3882
+- Logits/chosen: -2.4306
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-07
+- train_batch_size: 4
+- eval_batch_size: 8
+- seed: 42
+- distributed_type: multi-GPU
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 16
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: cosine
+- lr_scheduler_warmup_ratio: 0.1
+- num_epochs: 1
+### Training results
+| Training Loss | Epoch  | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
+|:-------------:|:------:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
+| 0.693         | 0.0262 | 100  | 0.6929          | -0.0014        | -0.0019          | 0.5320             | 0.0006          | -297.2994      | -336.3557    | -3.1228         | -3.1361       |
+| 0.6887        | 0.0523 | 200  | 0.6892          | -0.0302        | -0.0383          | 0.6160             | 0.0081          | -300.9348      | -339.2341    | -3.1215         | -3.1346       |
+| 0.6789        | 0.0785 | 300  | 0.6794          | -0.0789        | -0.1087          | 0.6360             | 0.0299          | -307.9798      | -344.1051    | -3.1094         | -3.1216       |
+| 0.6624        | 0.1047 | 400  | 0.6635          | -0.1807        | -0.2518          | 0.6390             | 0.0711          | -322.2854      | -354.2890    | -3.0664         | -3.0771       |
+| 0.6373        | 0.1309 | 500  | 0.6503          | -0.2988        | -0.4120          | 0.6425             | 0.1133          | -338.3080      | -366.0959    | -2.9693         | -2.9839       |
+| 0.6423        | 0.1570 | 600  | 0.6457          | -0.3891        | -0.5345          | 0.6375             | 0.1454          | -350.5518      | -375.1291    | -2.9372         | -2.9538       |
+| 0.6266        | 0.1832 | 700  | 0.6420          | -0.7030        | -0.9081          | 0.6365             | 0.2051          | -387.9123      | -406.5211    | -2.9095         | -2.9229       |
+| 0.5942        | 0.2094 | 800  | 0.6367          | -0.4969        | -0.6764          | 0.6475             | 0.1795          | -364.7484      | -385.9118    | -2.9255         | -2.9397       |
+| 0.6171        | 0.2355 | 900  | 0.6330          | -0.5389        | -0.7443          | 0.6545             | 0.2054          | -371.5351      | -390.1065    | -2.8815         | -2.8992       |
+| 0.6156        | 0.2617 | 1000 | 0.6271          | -0.9278        | -1.1788          | 0.6460             | 0.2510          | -414.9855      | -428.9975    | -2.8469         | -2.8665       |
+| 0.6636        | 0.2879 | 1100 | 0.6234          | -0.7984        | -1.0304          | 0.6515             | 0.2320          | -400.1489      | -416.0618    | -2.8144         | -2.8347       |
+| 0.6832        | 0.3141 | 1200 | 0.6152          | -1.0303        | -1.3170          | 0.6570             | 0.2866          | -428.8004      | -439.2536    | -2.7994         | -2.8212       |
+| 0.5967        | 0.3402 | 1300 | 0.6131          | -1.2342        | -1.5321          | 0.6655             | 0.2979          | -450.3198      | -459.6400    | -2.7494         | -2.7756       |
+| 0.596         | 0.3664 | 1400 | 0.6064          | -0.8587        | -1.1697          | 0.6820             | 0.3110          | -414.0766      | -422.0903    | -2.8084         | -2.8289       |
+| 0.592         | 0.3926 | 1500 | 0.6027          | -0.9689        | -1.3189          | 0.6715             | 0.3499          | -428.9929      | -433.1132    | -2.7455         | -2.7703       |
+| 0.6353        | 0.4187 | 1600 | 0.6051          | -0.9640        | -1.3223          | 0.6745             | 0.3582          | -429.3314      | -432.6226    | -2.6972         | -2.7245       |
+| 0.6603        | 0.4449 | 1700 | 0.6016          | -0.9893        | -1.3221          | 0.6765             | 0.3328          | -429.3145      | -435.1521    | -2.7021         | -2.7305       |
+| 0.5551        | 0.4711 | 1800 | 0.6023          | -1.0035        | -1.3765          | 0.6790             | 0.3731          | -434.7590      | -436.5641    | -2.6159         | -2.6492       |
+| 0.5877        | 0.4973 | 1900 | 0.5975          | -0.8137        | -1.1853          | 0.6835             | 0.3716          | -415.6308      | -417.5872    | -2.6621         | -2.6941       |
+| 0.5827        | 0.5234 | 2000 | 0.5935          | -0.8724        | -1.2562          | 0.6810             | 0.3838          | -422.7221      | -423.4575    | -2.6043         | -2.6396       |
+| 0.6017        | 0.5496 | 2100 | 0.5911          | -1.0065        | -1.3971          | 0.6905             | 0.3907          | -436.8172      | -436.8658    | -2.6105         | -2.6436       |
+| 0.5539        | 0.5758 | 2200 | 0.5920          | -0.9060        | -1.2945          | 0.6885             | 0.3884          | -426.5499      | -426.8195    | -2.5724         | -2.6076       |
+| 0.5795        | 0.6019 | 2300 | 0.5914          | -1.1164        | -1.5398          | 0.6865             | 0.4234          | -451.0841      | -447.8605    | -2.5399         | -2.5757       |
+| 0.5657        | 0.6281 | 2400 | 0.5904          | -1.0347        | -1.4494          | 0.6860             | 0.4147          | -442.0414      | -439.6861    | -2.5121         | -2.5487       |
+| 0.5306        | 0.6543 | 2500 | 0.5918          | -1.0464        | -1.4840          | 0.6825             | 0.4376          | -445.5005      | -440.8591    | -2.4692         | -2.5102       |
+| 0.5762        | 0.6805 | 2600 | 0.5927          | -1.0687        | -1.5141          | 0.6780             | 0.4455          | -448.5193      | -443.0862    | -2.4291         | -2.4735       |
+| 0.6016        | 0.7066 | 2700 | 0.5936          | -1.0767        | -1.5080          | 0.6800             | 0.4313          | -447.9063      | -443.8889    | -2.4329         | -2.4747       |
+| 0.6068        | 0.7328 | 2800 | 0.5897          | -1.1905        | -1.6433          | 0.6820             | 0.4527          | -461.4312      | -455.2722    | -2.4294         | -2.4708       |
+| 0.5821        | 0.7590 | 2900 | 0.5870          | -1.1245        | -1.5598          | 0.6845             | 0.4353          | -453.0833      | -448.6697    | -2.4470         | -2.4862       |
+| 0.5393        | 0.7851 | 3000 | 0.5873          | -1.2223        | -1.6710          | 0.6870             | 0.4486          | -464.2020      | -458.4521    | -2.4161         | -2.4565       |
+| 0.577         | 0.8113 | 3100 | 0.5886          | -1.1359        | -1.5757          | 0.6845             | 0.4399          | -454.6796      | -449.8056    | -2.4137         | -2.4538       |
+| 0.5731        | 0.8375 | 3200 | 0.5864          | -1.1928        | -1.6493          | 0.6900             | 0.4564          | -462.0313      | -455.5009    | -2.3988         | -2.4401       |
+| 0.586         | 0.8636 | 3300 | 0.5865          | -1.1740        | -1.6231          | 0.6895             | 0.4492          | -459.4178      | -453.6159    | -2.3969         | -2.4384       |
+| 0.5629        | 0.8898 | 3400 | 0.5860          | -1.1573        | -1.6086          | 0.6890             | 0.4513          | -457.9694      | -451.9486    | -2.3882         | -2.4306       |
+| 0.6059        | 0.9160 | 3500 | 0.5858          | -1.1672        | -1.6213          | 0.6890             | 0.4541          | -459.2307      | -452.9388    | -2.3897         | -2.4320       |
+| 0.5703        | 0.9422 | 3600 | 0.5860          | -1.1607        | -1.6138          | 0.6870             | 0.4532          | -458.4890      | -452.2865    | -2.3897         | -2.4320       |
+| 0.5533        | 0.9683 | 3700 | 0.5858          | -1.1623        | -1.6161          | 0.6880             | 0.4538          | -458.7165      | -452.4510    | -2.3882         | -2.4304       |
+| 0.5988        | 0.9945 | 3800 | 0.5862          | -1.1608        | -1.6138          | 0.6885             | 0.4530          | -458.4823      | -452.2973    | -2.3882         | -2.4306       |
+### Framework versions
+- Transformers 4.41.1
+- Pytorch 2.1.2
+- Datasets 2.19.1
+- Tokenizers 0.19.1
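The final evaluation metrics in the model card can be sanity-checked with a few lines of arithmetic. The sketch below assumes the usual DPO logging convention, in which the reward margin is the chosen reward minus the rejected reward and the per-pair loss is the negative log-sigmoid of that (already beta-scaled) margin. The helper name `dpo_loss_from_margin` is hypothetical and is not part of TRL.

```python
import math


def dpo_loss_from_margin(margin: float) -> float:
    """Per-pair sigmoid DPO loss, -log(sigmoid(margin)), for a beta-scaled margin.

    Hypothetical helper for illustration; not a TRL function.
    """
    return math.log1p(math.exp(-margin))


# Final evaluation values copied from the model card.
rewards_chosen = -1.1608
rewards_rejected = -1.6138
reported_margin = 0.4530
reported_eval_loss = 0.5862

# Assumption: margin = chosen reward - rejected reward.
margin = rewards_chosen - rewards_rejected

# Loss of an untrained policy (zero margin) and loss at the mean margin.
baseline_loss = dpo_loss_from_margin(0.0)
loss_at_mean_margin = dpo_loss_from_margin(reported_margin)

print(f"recomputed margin:   {margin:.4f}")
print(f"zero-margin loss:    {baseline_loss:.4f}")
print(f"loss at mean margin: {loss_at_mean_margin:.4f}")
```

Under these assumptions the recomputed margin agrees with the reported `Rewards/margins`, and the zero-margin loss equals ln 2, which lines up with the 0.693 training loss logged at step 100. The loss evaluated at the mean margin is lower than the reported mean loss of 0.5862, which is what one would expect when a convex per-pair loss is averaged over pairs with differing margins.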

all_results.json ADDED

	@@ -0,0 +1,9 @@

+{
+    "epoch": 1.0,
+    "total_flos": 0.0,
+    "train_loss": 0.6060711596530634,
+    "train_runtime": 35916.4658,
+    "train_samples": 61134,
+    "train_samples_per_second": 1.702,
+    "train_steps_per_second": 0.106
+}
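The throughput figures in this file can be cross-checked against the hyperparameters listed in the model card. This is a minimal sketch; it assumes that one optimizer step consumes `total_train_batch_size` samples and that the last partial batch still counts as a step, which may not match the Trainer's exact step accounting.

```python
import math

# Values copied from all_results.json and the model card.
train_samples = 61134
train_runtime_s = 35916.4658
total_train_batch_size = 16

# Samples processed per second over the whole run.
samples_per_second = train_samples / train_runtime_s

# Assumption: one optimizer step per total batch, partial last batch included.
estimated_steps = math.ceil(train_samples / total_train_batch_size)
steps_per_second = estimated_steps / train_runtime_s

# Fraction of the epoch completed at the last logged evaluation step (3800).
epoch_at_step_3800 = 3800 / estimated_steps

print(f"samples/s: {samples_per_second:.3f}")
print(f"estimated steps: {estimated_steps}")
print(f"steps/s: {steps_per_second:.3f}")
print(f"epoch at step 3800: {epoch_at_step_3800:.4f}")
```

With these assumptions the recomputed rates round to the reported `train_samples_per_second` and `train_steps_per_second`, and the epoch fraction at step 3800 rounds to the 0.9945 shown in the last row of the training results table.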

generation_config.json ADDED

	@@ -0,0 +1,7 @@

+{
+  "bos_token_id": 1,
+  "eos_token_id": 2,
+  "max_length": 2048,
+  "pad_token_id": 0,
+  "transformers_version": "4.41.1"
+}

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f81273e2e4858ec7753e12a1a02a023440ab4fb209b3625f8c365c66e91c1204
+oid sha256:3792403f623109d4a40c553f4f959aba1949f3ba6457dda6779abaf241cf3fbf
 size 4400216536
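The `model.safetensors` entry is a Git LFS pointer, so only the object hash changes in this commit while the size stays the same. As a rough illustration, the pointer can be parsed with the standard library. The function name `parse_lfs_pointer` is hypothetical, and the parameter estimate assumes 4-byte (float32) weights and ignores the small safetensors header.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file.

    Hypothetical helper for illustration; not part of any Git LFS tooling.
    """
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    hash_algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# New pointer contents copied from the diff above.
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:3792403f623109d4a40c553f4f959aba1949f3ba6457dda6779abaf241cf3fbf
size 4400216536
"""

pointer = parse_lfs_pointer(pointer_text)

# Assumption: weights stored as float32 (4 bytes each), header size ignored.
approx_params_if_fp32 = pointer["size_bytes"] / 4

print(pointer["hash_algo"], pointer["size_bytes"])
print(f"approx. parameters if float32: {approx_params_if_fp32:.3e}")
```

If the float32 assumption holds, the file size corresponds to roughly 1.1 billion parameters, in line with the "1.1b" in the model name.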

runs/Jun02_10-42-23_poseidon/events.out.tfevents.1717325399.poseidon.2371184.0 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b72d835da0b4124b404817ee218908fbba50a3bc59d6eb7f9a999e95ee7a64bf
+oid sha256:bdf5f8fc6ec2c96bebe381c84e67d771608094dfd93217d7ad0563800185383a
-size 295164
+size 296894

train_results.json ADDED

	@@ -0,0 +1,9 @@

+{
+    "epoch": 1.0,
+    "total_flos": 0.0,
+    "train_loss": 0.6060711596530634,
+    "train_runtime": 35916.4658,
+    "train_samples": 61134,
+    "train_samples_per_second": 1.702,
+    "train_steps_per_second": 0.106
+}

trainer_state.json ADDED

The diff for this file is too large to render.