cimol committed · Commit 5ba3ef4 · verified · 1 Parent(s): b8ad888

Update README.md

Files changed (1)
  1. README.md +1 -126
README.md CHANGED
@@ -14,121 +14,7 @@ model-index:
  should probably proofread and complete it, then remove this comment. -->
 
  [<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
- <details><summary>See axolotl config</summary>
 
- axolotl version: `0.4.1`
- ```yaml
- adapter: lora
- base_model: unsloth/SmolLM-360M-Instruct
- bf16: true
- chat_template: llama3
- data_processes: 54
- dataset_prepared_path: null
- datasets:
- - data_files:
-   - e7d8f23205d40f8d_train_data.json
-   ds_type: json
-   format: custom
-   path: /workspace/input_data/e7d8f23205d40f8d_train_data.json
-   type:
-     field_instruction: source
-     field_output: target
-     format: '{instruction}'
-     no_input_format: '{instruction}'
-     system_format: '{system}'
-     system_prompt: ''
- debug: null
- deepspeed: null
- device_map: auto
- distributed_training:
-   backend: nccl
-   multi_gpu: true
-   num_gpus: 2
- do_eval: true
- early_stopping_patience: 4
- eval_batch_size: 16
- eval_max_new_tokens: 128
- eval_steps: 150
- eval_table_size: null
- evals_per_epoch: null
- flash_attention: true
- fp16: false
- fsdp:
- - full_shard
- fsdp_config:
-   backward_prefetch: BACKWARD_PRE
-   cpu_offload: false
-   forward_prefetch: false
-   mixed_precision: bf16
-   sharding_strategy: FULL_SHARD
-   use_orig_params: true
- gradient_accumulation_steps: 1
- gradient_checkpointing: true
- group_by_length: true
- hub_ignore_patterns:
- - README.md
- - config.json
- hub_model_id: cimol/fcef08cb-0f5e-44e1-ba45-9c886c47afab
- hub_repo: null
- hub_strategy: end
- hub_token: null
- learning_rate: 7.0e-05
- load_in_4bit: false
- load_in_8bit: false
- local_rank: null
- logging_steps: 10
- lora_alpha: 128
- lora_dropout: 0.1
- lora_fan_in_fan_out: null
- lora_model_dir: null
- lora_r: 64
- lora_target_linear: true
- lr_scheduler: polynomial
- lr_scheduler_warmup_steps: 150
- max_grad_norm: 0.5
- max_memory:
-   0: 75GB
-   1: 75GB
- max_steps: 1350
- micro_batch_size: 16
- mlflow_experiment_name: /tmp/e7d8f23205d40f8d_train_data.json
- model_type: AutoModelForCausalLM
- num_epochs: 3
- optim_args:
-   adam_beta1: 0.9
-   adam_beta2: 0.95
-   adam_epsilon: 1e-8
- optimizer: adamw_torch
- output_dir: miner_id_24
- pad_to_sequence_len: true
- resume_from_checkpoint: null
- s2_attention: null
- sample_packing: false
- save_steps: 300
- saves_per_epoch: null
- seed: 17333
- sequence_len: 1024
- strict: false
- tf32: true
- tokenizer_type: AutoTokenizer
- total_train_batch_size: 32
- train_batch_size: 32
- train_on_inputs: false
- trust_remote_code: true
- val_set_size: 0.05
- wandb_entity: null
- wandb_mode: online
- wandb_name: be621283-b0f9-4449-8928-03752a4a386f
- wandb_project: Gradients-On-Demand
- wandb_run: your_name
- wandb_runid: be621283-b0f9-4449-8928-03752a4a386f
- warmup_steps: 150
- weight_decay: 0.01
- xformers_attention: null
-
- ```
-
- </details><br>
 
  # fcef08cb-0f5e-44e1-ba45-9c886c47afab
 
@@ -153,18 +39,7 @@ More information needed
  ### Training hyperparameters
 
  The following hyperparameters were used during training:
- - learning_rate: 7e-05
- - train_batch_size: 16
- - eval_batch_size: 16
- - seed: 17333
- - distributed_type: multi-GPU
- - num_devices: 2
- - total_train_batch_size: 32
- - total_eval_batch_size: 32
- - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=adam_beta1=0.9,adam_beta2=0.95,adam_epsilon=1e-8
- - lr_scheduler_type: polynomial
- - lr_scheduler_warmup_steps: 150
- - training_steps: 1350
+
 
  ### Training results
 
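For reference, the optimizer and schedule described in the deleted hyperparameter list can be reproduced outside Axolotl. Below is a minimal sketch using `torch` and `transformers`, assuming the values removed above are the effective ones (learning rate 7e-05, the `optimizer_args` betas of (0.9, 0.95) and epsilon 1e-8 rather than the default betas shown in the same line, weight decay 0.01, polynomial decay with 150 warmup steps over 1350 training steps); a dummy parameter stands in for a real model.

```python
import torch
from transformers import get_polynomial_decay_schedule_with_warmup

# Placeholder for model.parameters(); values mirror the deleted hyperparameter list.
params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = torch.optim.AdamW(
    params, lr=7e-5, betas=(0.9, 0.95), eps=1e-8, weight_decay=0.01
)
scheduler = get_polynomial_decay_schedule_with_warmup(
    optimizer, num_warmup_steps=150, num_training_steps=1350
)

for step in range(1350):
    # loss.backward() would go here in a real training loop
    optimizer.step()
    scheduler.step()
    optimizer.zero_grad()
```

The total_train_batch_size of 32 reported above follows directly from the removed config: micro_batch_size 16 × 2 GPUs × gradient_accumulation_steps 1.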
 
 
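The removed config also records where the trained LoRA adapter is published (`hub_model_id: cimol/fcef08cb-0f5e-44e1-ba45-9c886c47afab`) and which base model it wraps (`unsloth/SmolLM-360M-Instruct`). A minimal, non-authoritative sketch of loading that adapter with `transformers` and `peft`, assuming the adapter weights are actually available under that Hub id; the prompt and generation settings are illustrative only.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "unsloth/SmolLM-360M-Instruct"                    # base_model from the removed config
adapter_id = "cimol/fcef08cb-0f5e-44e1-ba45-9c886c47afab"   # hub_model_id from the removed config

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(base, adapter_id)  # attach the LoRA weights

inputs = tokenizer("Hello!", return_tensors="pt")    # illustrative prompt
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```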