pkarypis committed · Commit 757b94e · verified · 1 Parent(s): 8573011

Model save
README.md CHANGED
@@ -2,16 +2,11 @@
 license: apache-2.0
 base_model: JackFram/llama-68m
 tags:
-- alignment-handbook
 - trl
 - sft
 - generated_from_trainer
-- trl
-- sft
-- alignment-handbook
-- generated_from_trainer
 datasets:
-- HuggingFaceH4/ultrachat_200k
+- generator
 model-index:
 - name: gpt2-sft-port
   results: []
@@ -22,7 +17,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 # gpt2-sft-port
 
-This model is a fine-tuned version of [JackFram/llama-68m](https://huggingface.co/JackFram/llama-68m) on the HuggingFaceH4/ultrachat_200k dataset.
+This model is a fine-tuned version of [JackFram/llama-68m](https://huggingface.co/JackFram/llama-68m) on the generator dataset.
 It achieves the following results on the evaluation set:
 - Loss: 2.0739
@@ -60,8 +55,8 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 2.1065 | 1.0 | 2258 | 2.0952 |
-| 2.054 | 2.0 | 4516 | 2.0739 |
+| 2.1085 | 1.0 | 2258 | 2.0943 |
+| 2.0651 | 2.0 | 4516 | 2.0739 |
 
 
 ### Framework versions
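The model card reports only the raw evaluation loss. Assuming 2.0739 is a mean per-token cross-entropy in nats (the usual convention for a `Trainer` eval loss, not stated explicitly in the card), the corresponding perplexity is a one-liner:

```python
import math

# Reported evaluation loss from the model card.
# Assumption: mean per-token cross-entropy in nats.
eval_loss = 2.0739

# Perplexity is the exponential of the mean cross-entropy.
perplexity = math.exp(eval_loss)

print(f"eval perplexity: {perplexity:.2f}")  # ≈ 7.96
```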
all_results.json CHANGED
@@ -1,13 +1,8 @@
 {
     "epoch": 2.0,
-    "eval_loss": 2.0739028453826904,
-    "eval_runtime": 26.4578,
-    "eval_samples": 23110,
-    "eval_samples_per_second": 1208.678,
-    "eval_steps_per_second": 37.796,
-    "train_loss": 0.007307469739229727,
-    "train_runtime": 35.2869,
+    "train_loss": 2.136630948372284,
+    "train_runtime": 1210.2491,
     "train_samples": 207865,
-    "train_samples_per_second": 16375.952,
-    "train_steps_per_second": 127.98
+    "train_samples_per_second": 477.469,
+    "train_steps_per_second": 3.731
 }
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:542b73bb4d65c1177e469bb5c5a260bb95edf72d892979aa9a6722aeec596330
+oid sha256:20282bf0b1fe66befbee9d70a4f659f0c4d2159f237369558d687798f419e103
 size 136062744
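The safetensors blob changes content but not size, which is consistent with the llama-68m base. Assuming the weights are stored in a 2-byte dtype (fp16/bf16) and ignoring the small safetensors header — both assumptions, not stated in the commit — the byte count maps back to roughly 68M parameters:

```python
# Size of model.safetensors from the diff (unchanged by this commit).
size_bytes = 136_062_744

# Assumption: weights stored as 2-byte floats (fp16/bf16); the small
# safetensors header is ignored, so this is only an approximation.
bytes_per_param = 2
approx_params = size_bytes // bytes_per_param

print(f"~{approx_params / 1e6:.1f}M parameters")  # ~68.0M, matching llama-68m
```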
runs/Apr24_14-55-53_aga39/events.out.tfevents.1713988565.aga39.631152.0 CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:34a9d15e155b7d2948e97b49ec19efae5cafb9cdfd1803ec810891400a558863
-size 194988
+oid sha256:ec480a4e7180df1c4d0631c1a92383623ee894c82707d31ddd8cc16b900873ef
+size 196246
train_results.json CHANGED
@@ -1,8 +1,8 @@
 {
     "epoch": 2.0,
-    "train_loss": 0.007307469739229727,
-    "train_runtime": 35.2869,
+    "train_loss": 2.136630948372284,
+    "train_runtime": 1210.2491,
     "train_samples": 207865,
-    "train_samples_per_second": 16375.952,
-    "train_steps_per_second": 127.98
+    "train_samples_per_second": 477.469,
+    "train_steps_per_second": 3.731
 }
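The new throughput figures are internally consistent with the README loss table (4516 optimizer steps over 2 epochs). A quick sanity check — the implied effective batch size of about 128 is a figure derived here, not stated anywhere in the commit:

```python
# Values taken from the updated train_results.json and README loss table.
train_runtime = 1210.2491      # seconds
total_steps = 4516             # final step in the README loss table
samples_per_second = 477.469
steps_per_second = 3.731

# steps_per_second should match total_steps / train_runtime.
print(f"{total_steps / train_runtime:.3f} steps/s")   # ≈ 3.731

# Samples per step suggests the effective batch size (derived, assumed).
print(f"~{samples_per_second / steps_per_second:.0f} samples/step")  # ~128
```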
trainer_state.json CHANGED
The diff for this file is too large to render. See raw diff