saching0071 committed
Commit 6089c94 · verified · 1 Parent(s): eb484a8

Update main README with loading instructions

Files changed (1)
  README.md +28 -40
README.md CHANGED
@@ -1,5 +1,5 @@
 ---
-version: final
+version: main
 family: smollm2-1.7b
 model_name: score0_only-600B
 license: mit
@@ -8,7 +8,7 @@ tags:
 - transformer
 - smollm2
 ---
-# SmolLM2 score0_only-600B (Version: final)
+# SmolLM2 score0_only-600B (Version: main)
 
 ## Model Details
 - **Architecture:** SmolLM2
@@ -16,43 +16,31 @@ tags:
 
 ## Training Configuration
 ```yaml
-attention_logit_softcapping: null
-attention_scores_scalar: null
-attn_bias: false
-bias: false
-block_size: 8192
-final_logit_softcapping: null
-gelu_approximate: none
-head_size: 64
-hf_config:
-  name: SmolLM2-1.7B
-  org: HuggingFaceTB
-intermediate_size: 8192
-lm_head_bias: false
-mlp_class_name: LLaMAMLP
-n_embd: 2048
-n_expert: 0
-n_expert_per_token: 0
-n_head: 32
-n_layer: 24
-n_query_groups: 32
-name: SmolLM2-1.7B
-norm_class_name: RMSNorm
-norm_eps: 1.0e-05
-norm_qk: false
-padded_vocab_size: 49152
-padding_multiple: 512
-parallel_residual: false
-post_attention_norm: false
-post_mlp_norm: false
-rope_adjustments: null
-rope_base: 130000
-rope_condense_ratio: 1
-rotary_percentage: 1.0
-scale_embeddings: false
-shared_attention_norm: false
-sliding_window_layer_placing: null
-sliding_window_size: null
-vocab_size: 49152
+optimizer:
+  class_path: torch.optim.AdamW
+  init_args:
+    lr: 0.0005
+    weight_decay: 0.01
+precision: bf16-mixed
+seed: 42
+train:
+  global_batch_size: 1024
+  max_seq_length: 2048
+  max_tokens: 600000000000
+  micro_batch_size: 8
 
 ```
+
+## Model Loading and Revision System
+
+This repository hosts multiple revisions of the model.
+To load a specific revision, use the `revision` parameter. For example:
+
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+model = AutoModelForCausalLM.from_pretrained("locuslab/score0_only-600B", revision="final")
+tokenizer = AutoTokenizer.from_pretrained("locuslab/score0_only-600B", revision="final")
+```
+
+Replace `"final"` with the desired revision.
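
The loading section added above shows how to pin a revision but not how to discover which revisions exist. A minimal sketch of that discovery step, using `huggingface_hub.list_repo_refs`; the repo id is copied from the README, but which branches and tags are actually published under `locuslab/score0_only-600B` is an assumption:

```python
# Sketch: enumerate the revisions (branches and tags) of the model repo, then
# load one of them. Assumes `huggingface_hub` and `transformers` are installed;
# the revisions actually published under locuslab/score0_only-600B are an
# assumption, not verified here.
from huggingface_hub import list_repo_refs
from transformers import AutoModelForCausalLM

refs = list_repo_refs("locuslab/score0_only-600B")
revisions = [ref.name for ref in refs.branches] + [ref.name for ref in refs.tags]
print("Available revisions:", revisions)

# Pass any discovered name as `revision=` when loading, e.g. "main" or "final".
model = AutoModelForCausalLM.from_pretrained(
    "locuslab/score0_only-600B", revision=revisions[0]
)
```

As a side note on the training configuration above: with `global_batch_size: 1024` and `micro_batch_size: 8`, a single-process run would take 1024 / 8 = 128 gradient-accumulation steps per optimizer update; with data-parallel workers the accumulation count shrinks proportionally.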