amazingvince committed on
Commit
0c608d2
1 Parent(s): de8f857

End of training

Files changed (4)
  1. README.md +3 -1
  2. all_results.json +10 -0
  3. train_results.json +10 -0
  4. trainer_state.json +0 -0
README.md CHANGED
@@ -1,5 +1,7 @@
  ---
  library_name: transformers
+ language:
+ - en
  license: apache-2.0
  base_model: BEE-spoke-data/tFINE-680m-e32-d16-gqa-1024
  tags:
@@ -14,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
  # tFINE-680m-e32-d16-gqa-1024-flan-subsets-deduped-1024
 
- This model is a fine-tuned version of [BEE-spoke-data/tFINE-680m-e32-d16-gqa-1024](https://huggingface.co/BEE-spoke-data/tFINE-680m-e32-d16-gqa-1024) on an unknown dataset.
+ This model is a fine-tuned version of [BEE-spoke-data/tFINE-680m-e32-d16-gqa-1024](https://huggingface.co/BEE-spoke-data/tFINE-680m-e32-d16-gqa-1024) on the pszemraj/flan-subsets-deduped dataset.
 
  ## Model description
 
all_results.json ADDED
@@ -0,0 +1,10 @@
+ {
+     "epoch": 0.9999466719486831,
+     "num_input_tokens_seen": 2313940996,
+     "total_flos": 9.029436409798197e+18,
+     "train_loss": 0.7368571982491552,
+     "train_runtime": 130509.2663,
+     "train_samples": 4200414,
+     "train_samples_per_second": 32.185,
+     "train_steps_per_second": 0.126
+ }
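
The raw numbers in all_results.json are easier to read once converted into a few derived quantities. The following is a minimal sketch, not part of the commit: the helper name `derived_metrics` and the output keys are hypothetical, and the JSON values are copied from the file above. It assumes `train_runtime` is in seconds, and the tokens-per-sample figure assumes each sample was seen roughly once (the logged epoch is about 1.0).

```python
import json

# Values copied verbatim from all_results.json in this commit.
RESULTS_JSON = """
{
    "epoch": 0.9999466719486831,
    "num_input_tokens_seen": 2313940996,
    "total_flos": 9.029436409798197e+18,
    "train_loss": 0.7368571982491552,
    "train_runtime": 130509.2663,
    "train_samples": 4200414,
    "train_samples_per_second": 32.185,
    "train_steps_per_second": 0.126
}
"""


def derived_metrics(results: dict) -> dict:
    """Hypothetical helper: derive throughput figures from the logged results."""
    runtime_s = results["train_runtime"]  # assumed to be seconds
    tokens = results["num_input_tokens_seen"]
    return {
        # Wall-clock training time in hours.
        "runtime_hours": runtime_s / 3600,
        # Average input tokens processed per second of training.
        "tokens_per_second": tokens / runtime_s,
        # Rough average input length, assuming about one pass over the data.
        "tokens_per_sample": tokens / results["train_samples"],
        # Samples consumed per optimizer step; approximate, because both
        # logged rates are rounded to three decimal places.
        "samples_per_step": results["train_samples_per_second"]
        / results["train_steps_per_second"],
    }


metrics = derived_metrics(json.loads(RESULTS_JSON))
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these inputs the run works out to roughly 36 hours of training. The samples-per-step ratio lands a little above 255, which would be consistent with an effective batch size near 256, though that is an inference from rounded rates rather than something the commit states.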
train_results.json ADDED
@@ -0,0 +1,10 @@
+ {
+     "epoch": 0.9999466719486831,
+     "num_input_tokens_seen": 2313940996,
+     "total_flos": 9.029436409798197e+18,
+     "train_loss": 0.7368571982491552,
+     "train_runtime": 130509.2663,
+     "train_samples": 4200414,
+     "train_samples_per_second": 32.185,
+     "train_steps_per_second": 0.126
+ }
trainer_state.json ADDED
The diff for this file is too large to render.