FatCat87 committed
Commit 3049028 · verified · 1 Parent(s): 52d3256

End of training

Files changed (2):
  1. README.md +22 -21
  2. adapter_model.bin +2 -2
README.md CHANGED
@@ -1,12 +1,12 @@
 ---
-license: mit
+license: apache-2.0
 library_name: peft
 tags:
 - axolotl
 - generated_from_trainer
-base_model: microsoft/Phi-3.5-mini-instruct
+base_model: Qwen/Qwen2.5-0.5B
 model-index:
-- name: bebc098a-34ef-4279-bfd4-76bb95526dba
+- name: 6201afaa-a647-4d6e-b7ef-21dbf1764a57
   results: []
 ---
 
@@ -19,19 +19,19 @@ should probably proofread and complete it, then remove this comment. -->
 axolotl version: `0.4.1`
 ```yaml
 adapter: lora
-base_model: microsoft/Phi-3.5-mini-instruct
+base_model: Qwen/Qwen2.5-0.5B
 bf16: auto
 datasets:
 - data_files:
-  - 0e75859cf9d8cc9a_train_data.json
+  - e7d5bcb285c8a077_train_data.json
   ds_type: json
   format: custom
-  path: 0e75859cf9d8cc9a_train_data.json
+  path: e7d5bcb285c8a077_train_data.json
   type:
     field: null
-    field_input: null
-    field_instruction: question
-    field_output: answer
+    field_input: disorder
+    field_instruction: input
+    field_output: dssp3
     field_system: null
     format: null
     no_input_format: null
@@ -51,7 +51,7 @@ fsdp_config: null
 gradient_accumulation_steps: 4
 gradient_checkpointing: true
 group_by_length: false
-hub_model_id: FatCat87/bebc098a-34ef-4279-bfd4-76bb95526dba
+hub_model_id: FatCat87/6201afaa-a647-4d6e-b7ef-21dbf1764a57
 learning_rate: 0.0002
 load_in_4bit: false
 load_in_8bit: true
@@ -82,9 +82,9 @@ val_set_size: 0.1
 wandb_entity: fatcat87-taopanda
 wandb_log_model: null
 wandb_mode: online
-wandb_name: bebc098a-34ef-4279-bfd4-76bb95526dba
+wandb_name: 6201afaa-a647-4d6e-b7ef-21dbf1764a57
 wandb_project: subnet56
-wandb_runid: bebc098a-34ef-4279-bfd4-76bb95526dba
+wandb_runid: 6201afaa-a647-4d6e-b7ef-21dbf1764a57
 wandb_watch: null
 warmup_ratio: 0.05
 weight_decay: 0.0
@@ -94,11 +94,12 @@ xformers_attention: null
 
 </details><br>
 
-# bebc098a-34ef-4279-bfd4-76bb95526dba
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/fatcat87-taopanda/subnet56/runs/rxyx3cte)
+# 6201afaa-a647-4d6e-b7ef-21dbf1764a57
 
-This model is a fine-tuned version of [microsoft/Phi-3.5-mini-instruct](https://huggingface.co/microsoft/Phi-3.5-mini-instruct) on the None dataset.
+This model is a fine-tuned version of [Qwen/Qwen2.5-0.5B](https://huggingface.co/Qwen/Qwen2.5-0.5B) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.8199
+- Loss: 1.1313
 
 ## Model description
 
@@ -128,23 +129,23 @@ The following hyperparameters were used during training:
 - total_eval_batch_size: 4
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
-- lr_scheduler_warmup_steps: 2
+- lr_scheduler_warmup_steps: 8
 - num_epochs: 1
 
 ### Training results
 
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 4.8381        | 0.0435 | 1    | 0.8313          |
-| 4.5187        | 0.2609 | 6    | 0.8197          |
-| 4.0981        | 0.5217 | 12   | 0.8163          |
-| 3.886         | 0.7826 | 18   | 0.8199          |
+| 1.951         | 0.0051 | 1    | 1.9665          |
+| 1.1794        | 0.2506 | 49   | 1.1707          |
+| 1.124         | 0.5013 | 98   | 1.1384          |
+| 1.1442        | 0.7519 | 147  | 1.1313          |
 
 
 ### Framework versions
 
 - PEFT 0.11.1
-- Transformers 4.44.2
+- Transformers 4.42.3
 - Pytorch 2.3.0+cu121
 - Datasets 2.19.1
 - Tokenizers 0.19.1
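
The updated `datasets` block maps `field_instruction: input`, `field_input: disorder`, and `field_output: dssp3`, which points at per-residue protein secondary-structure data (3-state DSSP labels plus disorder annotations). The training file is not published, so the record below is only a sketch of the shape those keys imply; every value is an assumption:

```python
# Hypothetical shape of one record in e7d5bcb285c8a077_train_data.json,
# inferred from the axolotl field mapping above. The file itself is not in
# the repo, so keys come from the config and all values are invented.
example_record = {
    "input": "MKTAYIAKQR",     # residue sequence used as the instruction (assumed)
    "disorder": "0000011100",  # per-residue disorder flags used as extra input (assumed)
    "dssp3": "CCHHHHHHCC",     # 3-state secondary-structure labels as the target (assumed)
}
```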
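The hyperparameter list pairs Adam at lr 2e-4 with a cosine schedule and 8 warmup steps. A minimal sketch of that schedule, assuming a horizon of roughly 196 optimizer steps (step 1 corresponds to epoch 0.0051 in the results table, so one epoch is about 196 steps; that count is inferred, not logged):

```python
# Minimal sketch of the card's LR schedule: linear warmup, then cosine decay.
# lr, betas, eps, weight decay, and warmup steps come from the card; the
# ~196-step total is an inference from the results table.
import torch
from transformers import get_cosine_schedule_with_warmup

params = [torch.nn.Parameter(torch.zeros(1))]  # stand-in parameters
optimizer = torch.optim.Adam(params, lr=2e-4, betas=(0.9, 0.999), eps=1e-8, weight_decay=0.0)
scheduler = get_cosine_schedule_with_warmup(optimizer, num_warmup_steps=8, num_training_steps=196)

for step in range(196):
    optimizer.step()   # the actual training step would go here
    scheduler.step()   # LR rises linearly for 8 steps, then follows a cosine decay
```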
adapter_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c1b3a76cfdd55b91b883fc7275929ce3c900bb1c70573c0f7d7a78968472fbc5
-size 201419466
+oid sha256:85f0f329502f927827eed212d6bff3aca2a61fa1e4be9c07333007fe5f411614
+size 70506570
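
The `adapter_model.bin` change swaps in the new LoRA weights; the much smaller size is consistent with moving from the Phi-3.5-mini base to the 0.5B Qwen base. A minimal sketch of loading the published adapter with PEFT, using the repo ids from the card; the prompt is a placeholder, since the expected input format depends on the unpublished dataset:

```python
# Minimal sketch of loading this LoRA adapter onto its base model with PEFT.
# Repo ids are taken from the updated card; the prompt is a placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B")
model = PeftModel.from_pretrained(base, "FatCat87/6201afaa-a647-4d6e-b7ef-21dbf1764a57")

inputs = tokenizer("MKTAYIAKQR", return_tensors="pt")  # placeholder input
outputs = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```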