---
license: gemma
base_model: google/gemma-2-9b
tags:
- trl
- sft
- generated_from_trainer
model-index:
- name: collapse_gemma-2-9b_hs2_replace_iter3_sftsd0
  results: []
---

# collapse_gemma-2-9b_hs2_replace_iter3_sftsd0

This model is a fine-tuned version of [google/gemma-2-9b](https://huggingface.co/google/gemma-2-9b) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 1.3146
- Num Input Tokens Seen: 4629248

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 8e-06
- train_batch_size: 4
- eval_batch_size: 16
- seed: 0
- gradient_accumulation_steps: 32
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant_with_warmup
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 1

### Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:----:|:---------------:|:-----------------:|
| No log | 0 | 0 | 1.2335 | 0 |
| 1.1505 | 0.0514 | 5 | 1.0838 | 237896 |
| 0.5751 | 0.1028 | 10 | 1.1218 | 477308 |
| 0.257 | 0.1541 | 15 | 1.1457 | 712540 |
| 0.0988 | 0.2055 | 20 | 1.2485 | 950820 |
| 0.073 | 0.2569 | 25 | 1.3308 | 1186016 |
| 0.0413 | 0.3083 | 30 | 1.2650 | 1434072 |
| 0.0789 | 0.3597 | 35 | 1.2347 | 1670144 |
| 0.0581 | 0.4110 | 40 | 1.1793 | 1906636 |
| 0.0455 | 0.4624 | 45 | 1.1711 | 2150924 |
| 0.0256 | 0.5138 | 50 | 1.2403 | 2385516 |
| 0.0448 | 0.5652 | 55 | 1.2306 | 2630840 |
| 0.0498 | 0.6166 | 60 | 1.2123 | 2870696 |
| 0.0334 | 0.6680 | 65 | 1.2054 | 3105528 |
| 0.0319 | 0.7193 | 70 | 1.2169 | 3337616 |
| 0.0269 | 0.7707 | 75 | 1.2902 | 3571224 |
| 0.0304 | 0.8221 | 80 | 1.2938 | 3813928 |
| 0.0335 | 0.8735 | 85 | 1.2884 | 4054540 |
| 0.0272 | 0.9249 | 90 | 1.2766 | 4299016 |
| 0.0282 | 0.9762 | 95 | 1.3041 | 4533368 |

### Framework versions

- Transformers 4.44.0
- Pytorch 2.4.0+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1
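As a sanity check on the hyperparameters above, the reported `total_train_batch_size` follows from the per-device batch size and gradient accumulation. A minimal sketch (variable names are illustrative, not from the training script):

```python
# Sketch: deriving the effective train batch size from the listed hyperparameters.
train_batch_size = 4             # per-device micro-batch size (from the list above)
gradient_accumulation_steps = 32  # gradients accumulated before each optimizer step

# Effective batch size per optimizer step (single device assumed here).
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 128, matching the reported total_train_batch_size
```

This assumes training on a single device; with data parallelism the effective size would also be multiplied by the number of devices.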