Model save

Files changed (7)

README.md CHANGED

@@ -2,15 +2,11 @@
 license: apache-2.0
 base_model: JackFram/llama-68m
 tags:
-- alignment-handbook
-- trl
-- sft
-- generated_from_trainer
 - trl
 - sft
 - generated_from_trainer
 datasets:
-- HuggingFaceH4/ultrachat_200k
 model-index:
 - name: gpt2-sft-port
   results: []
@@ -21,9 +17,9 @@ should probably proofread and complete it, then remove this comment. -->
 # gpt2-sft-port
-This model is a fine-tuned version of [JackFram/llama-68m](https://huggingface.co/JackFram/llama-68m) on the HuggingFaceH4/ultrachat_200k dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.0739
 ## Model description
@@ -47,9 +43,9 @@ The following hyperparameters were used during training:
 - eval_batch_size: 8
 - seed: 42
 - distributed_type: multi-GPU
-- num_devices: 4
-- total_train_batch_size: 128
-- total_eval_batch_size: 32
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.1
@@ -59,8 +55,8 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 2.1085        | 1.0   | 2258 | 2.0943          |
-| 2.0651        | 2.0   | 4516 | 2.0739          |
 ### Framework versions

 license: apache-2.0
 base_model: JackFram/llama-68m
 tags:
 - trl
 - sft
 - generated_from_trainer
 datasets:
+- generator
 model-index:
 - name: gpt2-sft-port
   results: []
 # gpt2-sft-port
+This model is a fine-tuned version of [JackFram/llama-68m](https://huggingface.co/JackFram/llama-68m) on the generator dataset.
 It achieves the following results on the evaluation set:
+- Loss: 2.1067
 ## Model description
 - eval_batch_size: 8
 - seed: 42
 - distributed_type: multi-GPU
+- num_devices: 8
+- total_train_batch_size: 256
+- total_eval_batch_size: 64
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.1
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 2.1213        | 1.0   | 1129 | 2.1273          |
+| 2.0929        | 2.0   | 2258 | 2.1067          |
 ### Framework versions
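
The hyperparameter change above doubles both the device count and the total train batch size, while the step count per epoch halves. A minimal sketch of that arithmetic, using only numbers from this diff. The per-device batch size and gradient accumulation setting are not visible in the hunks, so only their product is derived here, and the function names are illustrative:

```python
# Values copied from the README.md hunks in this commit.
old = {"num_devices": 4, "total_train_batch_size": 128, "steps_per_epoch": 2258}
new = {"num_devices": 8, "total_train_batch_size": 256, "steps_per_epoch": 1129}


def per_device_share(run: dict) -> int:
    """Examples each device contributes per optimizer step
    (per-device batch size times gradient accumulation steps)."""
    return run["total_train_batch_size"] // run["num_devices"]


def examples_per_epoch(run: dict) -> int:
    """Upper bound on examples seen per epoch (the last batch may be partial)."""
    return run["total_train_batch_size"] * run["steps_per_epoch"]


for name, run in (("old", old), ("new", new)):
    print(name, per_device_share(run), examples_per_epoch(run))
```

Both runs give the same per-device share and the same upper bound on examples per epoch, which suggests the per-device settings were left unchanged and only the number of GPUs differs.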

all_results.json CHANGED

@@ -1,13 +1,8 @@
 {
     "epoch": 2.0,
-    "eval_loss": 2.073899507522583,
-    "eval_runtime": 26.2955,
-    "eval_samples": 23110,
-    "eval_samples_per_second": 1216.139,
-    "eval_steps_per_second": 38.029,
-    "train_loss": 2.136630948372284,
-    "train_runtime": 1210.2491,
     "train_samples": 207865,
-    "train_samples_per_second": 477.469,
-    "train_steps_per_second": 3.731
 }

 {
     "epoch": 2.0,
+    "train_loss": 2.17739642542796,
+    "train_runtime": 641.6533,
     "train_samples": 207865,
+    "train_samples_per_second": 900.574,
+    "train_steps_per_second": 3.519
 }
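
As a rough sanity check on the metrics above, the reported runtime and steps-per-second figures can be multiplied back into a step count, and the two runtimes compared. This is a sketch using only values from this diff; the variable names and the "scaling efficiency" framing are illustrative, not something the commit itself reports:

```python
# Values copied from the old and new all_results.json in this commit.
old = {"train_runtime": 1210.2491, "train_steps_per_second": 3.731, "devices": 4}
new = {"train_runtime": 641.6533, "train_steps_per_second": 3.519, "devices": 8}


def implied_steps(run: dict) -> float:
    """Total optimizer steps implied by runtime x throughput (approximate,
    because the logged throughput is rounded to three decimals)."""
    return run["train_runtime"] * run["train_steps_per_second"]


# Wall-clock speedup from doubling the device count.
speedup = old["train_runtime"] / new["train_runtime"]

# Fraction of the ideal speedup (devices ratio) actually achieved.
scaling_efficiency = speedup / (new["devices"] / old["devices"])

print(f"implied steps: old={implied_steps(old):.0f}, new={implied_steps(new):.0f}")
print(f"speedup={speedup:.2f}x, scaling efficiency={scaling_efficiency:.0%}")
```

The implied step counts land close to the two-epoch totals in the README tables (4516 and 2258), so the logged metrics are internally consistent.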

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3bd53bfc8f6bed8c963823ac774ec0d2e33cb8782ed7363cb8084ed925fed564
 size 136062744

 version https://git-lfs.github.com/spec/v1
+oid sha256:05162d09558019270998ed5c5fe53121ea8e093b25ac1a9d4becaccf5356e6b6
 size 136062744
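
The entries above are Git LFS pointer files rather than the weights themselves: the diff shows only that the SHA-256 digest changed while the size stayed the same. A minimal sketch of reading such a pointer, assuming the three-line `version` / `oid` / `size` layout shown here; `parse_lfs_pointer` is a hypothetical helper, not part of any library:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its fields (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>", separated by the first space.
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


old_ptr = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:3bd53bfc8f6bed8c963823ac774ec0d2e33cb8782ed7363cb8084ed925fed564
size 136062744
""")
new_ptr = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:05162d09558019270998ed5c5fe53121ea8e093b25ac1a9d4becaccf5356e6b6
size 136062744
""")

# Same byte size, different content hash: the weights were retrained,
# but the architecture and dtype (and hence file size) are unchanged.
print(old_ptr["size"] == new_ptr["size"], old_ptr["digest"] != new_ptr["digest"])
```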

runs/Apr24_16-39-01_aga39/events.out.tfevents.1713994753.aga39.645447.0 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:c5539baaaf5023aef2f3538e240adf4cafe1de9c7315aa80d1e471c8f7caf719
+size 100878

train_results.json CHANGED

@@ -1,8 +1,8 @@
 {
     "epoch": 2.0,
-    "train_loss": 2.136630948372284,
-    "train_runtime": 1210.2491,
     "train_samples": 207865,
-    "train_samples_per_second": 477.469,
-    "train_steps_per_second": 3.731
 }

 {
     "epoch": 2.0,
+    "train_loss": 2.17739642542796,
+    "train_runtime": 641.6533,
     "train_samples": 207865,
+    "train_samples_per_second": 900.574,
+    "train_steps_per_second": 3.519
 }

trainer_state.json CHANGED

The diff for this file is too large to render. See raw diff

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a17dccee37de884ea4e6472fc2742ff9af92f2cc68f9ac51f033e585d3744a2a
 size 6072

 version https://git-lfs.github.com/spec/v1
+oid sha256:b265d5a384c5c1ec6439688f515216fcdb4a4736a63d3e75fbc4bf184c4bfc01
 size 6072