Commit f8a1408 (verified) by TristanBehrens · 1 Parent(s): 0d9e0c8

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +62 -0
README.md ADDED
@@ -0,0 +1,62 @@
+---
+language:
+- en
+tags:
+- NLP
+license: mit
+datasets:
+- TristanBehrens/metal_cluster
+- TristanBehrens/metal_arrangement
+- TristanBehrens/metal_interleaved
+base_model: None
+---
+
+# metal_omni_two - An xLSTM Model
+
+![Trained with Helibrunna](banner.jpg)
+
+Trained with [Helibrunna](https://github.com/AI-Guru/helibrunna) by [Dr. Tristan Behrens](https://de.linkedin.com/in/dr-tristan-behrens-734967a2).
+
+## Configuration
+
+```
+training:
+  model_name: metal_omni_two
+  batch_size: 28
+  lr: 0.001
+  lr_warmup_steps: 1445
+  lr_decay_until_steps: 14455
+  lr_decay_factor: 0.001
+  weight_decay: 0.1
+  amp_precision: bfloat16
+  weight_precision: float32
+  enable_mixed_precision: true
+  num_epochs: 5
+  output_dir: output/metal_omni_two
+  save_every_step: 500
+  log_every_step: 10
+  wandb_project: tonnetz
+  torch_compile: false
+model:
+  type: llamathree
+  context_length: 2048
+  emb_dim: 256
+  n_heads: 4
+  n_layers: 6
+  hidden_dim: 128
+  hidden_activation: silu
+  n_kv_groups: 1
+  rope_base: 50000
+  rope_freq: null
+  dtype: float32
+  vocab_size: 269
+dataset:
+  hugging_face_ids:
+  - TristanBehrens/metal_cluster
+  - TristanBehrens/metal_arrangement
+  - TristanBehrens/metal_interleaved
+tokenizer:
+  type: whitespace
+  fill_token: '[EOS]'
+
+```
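
The `training` section gives the learning-rate numbers (`lr`, `lr_warmup_steps`, `lr_decay_until_steps`, `lr_decay_factor`) but not the shape of the schedule. The sketch below shows one plausible reading: a linear warmup to the peak rate, followed by a cosine decay down to `lr * lr_decay_factor`, which is then held constant. The linear-plus-cosine shape and the function name `learning_rate` are assumptions made for illustration; the README does not confirm how Helibrunna actually applies these values.

```python
import math

# Values copied from the `training` section of the configuration above.
LR = 0.001
LR_WARMUP_STEPS = 1445
LR_DECAY_UNTIL_STEPS = 14455
LR_DECAY_FACTOR = 0.001


def learning_rate(step: int) -> float:
    """Assumed schedule: linear warmup, cosine decay, then a constant floor."""
    min_lr = LR * LR_DECAY_FACTOR
    if step < LR_WARMUP_STEPS:
        # Linear ramp from 0 up to the peak learning rate.
        return LR * step / LR_WARMUP_STEPS
    if step >= LR_DECAY_UNTIL_STEPS:
        # Past the decay window the rate stays at its floor.
        return min_lr
    # Cosine interpolation between the peak and the floor.
    progress = (step - LR_WARMUP_STEPS) / (LR_DECAY_UNTIL_STEPS - LR_WARMUP_STEPS)
    return min_lr + 0.5 * (LR - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Print the rate at the start, the end of warmup, mid-decay, and the floor.
    for step in (0, 1445, 7950, 14455, 20000):
        print(step, learning_rate(step))
```

Under these assumptions the rate peaks at 0.001 once warmup ends at step 1445 and bottoms out at 0.001 × 0.001 = 1e-6 from step 14455 onward.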