run_gemma-2-2b_20250507_202421-intent-cls

Files changed (4)

README.md +86 -0
adapter_config.json +40 -0
adapter_model.safetensors +3 -0
training_args.bin +3 -0

README.md ADDED

	@@ -0,0 +1,86 @@

+---
+library_name: peft
+license: gemma
+base_model: google/gemma-2-2b
+tags:
+- generated_from_trainer
+metrics:
+- accuracy
+model-index:
+- name: run_gemma-2-2b_20250507_202421
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# run_gemma-2-2b_20250507_202421
+This model is a fine-tuned version of [google/gemma-2-2b](https://huggingface.co/google/gemma-2-2b) on an unspecified dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.5589
+- Accuracy: 0.7986
+- Precision General: 0.7985
+- Recall General: 0.9722
+- F1 General: 0.8768
+- Precision Memo: 0.8
+- Recall Memo: 0.3333
+- F1 Memo: 0.4706
+- Precision Album: 0.0
+- Recall Album: 0.0
+- F1 Album: 0.0
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.0001
+- train_batch_size: 16
+- eval_batch_size: 8
+- seed: 42
+- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9,0.999) and epsilon=1e-08, no additional optimizer arguments
+- lr_scheduler_type: cosine
+- lr_scheduler_warmup_ratio: 0.1
+- num_epochs: 15
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision General | Recall General | F1 General | Precision Memo | Recall Memo | F1 Memo | Precision Album | Recall Album | F1 Album |
+|:-------------:|:-----:|:----:|:---------------:|:--------:|:-----------------:|:--------------:|:----------:|:--------------:|:-----------:|:-------:|:---------------:|:------------:|:--------:|
+| 0.8016        | 1.0   | 147  | 1.0285          | 0.5274   | 0.816             | 0.4744         | 0.6        | 0.3114         | 0.7222      | 0.4351  | 0.0             | 0.0          | 0.0      |
+| 0.5683        | 2.0   | 294  | 0.6889          | 0.75     | 0.8025            | 0.8884         | 0.8433     | 0.5185         | 0.3889      | 0.4444  | 0.0             | 0.0          | 0.0      |
+| 0.5543        | 3.0   | 441  | 0.5514          | 0.7705   | 0.7891            | 0.9395         | 0.8577     | 0.6389         | 0.3194      | 0.4259  | 0.0             | 0.0          | 0.0      |
+| 0.4472        | 4.0   | 588  | 0.5336          | 0.8151   | 0.8132            | 0.9721         | 0.8856     | 0.8286         | 0.4028      | 0.5421  | 0.0             | 0.0          | 0.0      |
+| 0.4579        | 5.0   | 735  | 0.6379          | 0.75     | 0.8114            | 0.8605         | 0.8352     | 0.5312         | 0.4722      | 0.5     | 0.0             | 0.0          | 0.0      |
+| 0.5291        | 6.0   | 882  | 0.6045          | 0.8048   | 0.8062            | 0.9674         | 0.8795     | 0.7941         | 0.375       | 0.5094  | 0.0             | 0.0          | 0.0      |
+| 0.4397        | 7.0   | 1029 | 0.5941          | 0.8116   | 0.8293            | 0.9488         | 0.8850     | 0.7174         | 0.4583      | 0.5593  | 0.0             | 0.0          | 0.0      |
+| 0.3478        | 8.0   | 1176 | 0.6598          | 0.8151   | 0.8061            | 0.9860         | 0.8870     | 0.8966         | 0.3611      | 0.5149  | 0.0             | 0.0          | 0.0      |
+| 0.4765        | 9.0   | 1323 | 0.6486          | 0.8082   | 0.8069            | 0.9721         | 0.8819     | 0.8182         | 0.375       | 0.5143  | 0.0             | 0.0          | 0.0      |
+| 0.3421        | 10.0  | 1470 | 0.6713          | 0.8082   | 0.8               | 0.9860         | 0.8833     | 0.8889         | 0.3333      | 0.4848  | 0.0             | 0.0          | 0.0      |
+| 0.304         | 11.0  | 1617 | 0.6890          | 0.8048   | 0.8038            | 0.9721         | 0.88       | 0.8125         | 0.3611      | 0.5     | 0.0             | 0.0          | 0.0      |
+| 0.3041        | 12.0  | 1764 | 0.6821          | 0.8014   | 0.8054            | 0.9628         | 0.8771     | 0.7714         | 0.375       | 0.5047  | 0.0             | 0.0          | 0.0      |
+| 0.3565        | 13.0  | 1911 | 0.6882          | 0.8048   | 0.8086            | 0.9628         | 0.8790     | 0.7778         | 0.3889      | 0.5185  | 0.0             | 0.0          | 0.0      |
+| 0.3987        | 14.0  | 2058 | 0.6888          | 0.8014   | 0.8054            | 0.9628         | 0.8771     | 0.7714         | 0.375       | 0.5047  | 0.0             | 0.0          | 0.0      |
+| 0.338         | 15.0  | 2205 | 0.6909          | 0.8014   | 0.8054            | 0.9628         | 0.8771     | 0.7714         | 0.375       | 0.5047  | 0.0             | 0.0          | 0.0      |
+### Framework versions
+- PEFT 0.15.0
+- Transformers 4.50.0.dev0
+- Pytorch 2.6.0+cu124
+- Datasets 3.4.1
+- Tokenizers 0.21.1
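
The per-class F1 values in the evaluation summary can be cross-checked against the reported precision and recall. The following is a minimal sketch, not part of the commit. The helper name `f1_score` is hypothetical, and it assumes that an undefined F1 (precision and recall both zero) is reported as 0.0, which is consistent with the Album rows of the card.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall.

    Assumption: when both inputs are 0 the score is undefined,
    and we return 0.0, matching what the card reports for Album.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the evaluation summary.
reported = {
    "General": (0.7985, 0.9722, 0.8768),
    "Memo": (0.8, 0.3333, 0.4706),
    "Album": (0.0, 0.0, 0.0),
}

for label, (p, r, f1) in reported.items():
    computed = f1_score(p, r)
    print(f"{label}: computed F1 = {computed:.4f}, reported F1 = {f1:.4f}")
```

The recomputed values agree with the card to the reported precision. The Album recall of 0.0 in every epoch means no Album example in the evaluation set was classified correctly at any checkpoint.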

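The hyperparameters in the card (learning rate 0.0001, cosine scheduler, warmup ratio 0.1, 15 epochs of 147 steps) imply a learning-rate curve that can be sketched in plain Python. This is an illustration of the usual linear-warmup, cosine-decay shape rather than the Trainer's own code. The function name `lr_at` is hypothetical, and rounding the warmup step count up is an assumption.

```python
import math

BASE_LR = 1e-4      # learning_rate from the card
TOTAL_STEPS = 2205  # 15 epochs x 147 steps per epoch, from the results table
# Assumption: the warmup ratio is converted to a step count by rounding up.
WARMUP_STEPS = math.ceil(0.1 * TOTAL_STEPS)


def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step under warmup + cosine decay."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the base learning rate.
        return BASE_LR * step / max(1, WARMUP_STEPS)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    # Half-cosine decay from the base learning rate down to 0.
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


# Print the rate at the end of a few epochs (147 steps each).
for epoch in (1, 2, 5, 10, 15):
    step = epoch * 147
    print(f"epoch {epoch:>2} (step {step:>4}): lr = {lr_at(step):.3e}")
```

Under these assumptions the warmup ends during the second epoch, so the first epoch in the results table ran entirely at a reduced learning rate.
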
adapter_config.json ADDED

	@@ -0,0 +1,40 @@

+{
+  "alpha_pattern": {},
+  "auto_mapping": null,
+  "base_model_name_or_path": "google/gemma-2-2b",
+  "bias": "none",
+  "corda_config": null,
+  "eva_config": null,
+  "exclude_modules": null,
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layer_replication": null,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "loftq_config": {},
+  "lora_alpha": 32,
+  "lora_bias": false,
+  "lora_dropout": 0.05,
+  "megatron_config": null,
+  "megatron_core": "megatron.core",
+  "modules_to_save": [
+    "classifier",
+    "score"
+  ],
+  "peft_type": "LORA",
+  "r": 8,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "up_proj",
+    "q_proj",
+    "down_proj",
+    "v_proj",
+    "o_proj"
+  ],
+  "task_type": "SEQ_CLS",
+  "trainable_token_indices": null,
+  "use_dora": false,
+  "use_rslora": false
+}
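
The `lora_alpha`, `r`, and `use_rslora` fields above determine how strongly the low-rank update is scaled before it is added to the frozen weights. The sketch below recomputes that factor from a subset of the config. The helper `lora_scaling` is hypothetical and is not a PEFT function. It assumes the standard LoRA convention of `lora_alpha / r`, with `lora_alpha / sqrt(r)` when rank-stabilized LoRA is enabled.

```python
import json
import math

# Subset of adapter_config.json relevant to the update scale.
config = json.loads("""
{
  "lora_alpha": 32,
  "r": 8,
  "lora_dropout": 0.05,
  "use_rslora": false,
  "target_modules": ["up_proj", "q_proj", "down_proj", "v_proj", "o_proj"]
}
""")


def lora_scaling(cfg: dict) -> float:
    """Scale applied to the low-rank update B @ A (hypothetical helper)."""
    if cfg.get("use_rslora"):
        # Rank-stabilized LoRA divides by sqrt(r) instead of r.
        return cfg["lora_alpha"] / math.sqrt(cfg["r"])
    return cfg["lora_alpha"] / cfg["r"]


print("scaling factor:", lora_scaling(config))
print("adapted modules:", sorted(config["target_modules"]))
```

With these values the factor is 32 / 8 = 4. The target list adapts five projection types and leaves `k_proj` and `gate_proj` without adapters.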

adapter_model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ad71579cfc45efe3d5af0b81e8c57249983e530d568d4855bd614aa01692065d
+size 29241824
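
The three added lines are a Git LFS pointer, not the weights themselves: the repository stores only a spec version, a content hash, and a byte count. The sketch below parses the pointer shown above. The function name `parse_lfs_pointer` is hypothetical, and it assumes each line is a single `key value` pair separated by one space.

```python
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:ad71579cfc45efe3d5af0b81e8c57249983e530d568d4855bd614aa01692065d
size 29241824
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into its fields (hypothetical helper)."""
    # Each non-empty line is "key value"; split only on the first space.
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line.strip())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


pointer = parse_lfs_pointer(POINTER)
print(pointer["hash_algorithm"], pointer["digest"][:12] + "...")
print(f"{pointer['size_bytes'] / 2**20:.1f} MiB")
```

The recorded size is about 27.9 MiB, so this file holds only the adapter, not the full base model.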

training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2f1deaafeac090b54b5473359df85a66f1e0600f923b20de5f11ac2440df74ec
+size 5368