urarik/t5-asr-CV16

@@ -1,60 +1,73 @@
----
-library_name: transformers
-license: apache-2.0
-base_model: google/umt5-small
-tags:
-- generated_from_trainer
-model-index:
-- name: t5-asr-CV16
-  results: []
----
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-# t5-asr-CV16
-This model is a fine-tuned version of [google/umt5-small](https://huggingface.co/google/umt5-small) on an unknown dataset.
-It achieves the following results on the evaluation set:
-- eval_loss: 1.5286
-- eval_wer: 0.9694
-- eval_runtime: 506.6751
-- eval_samples_per_second: 3.947
-- eval_steps_per_second: 0.987
-- epoch: 3.9956
-- step: 3320
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
-## Training procedure
-### Training hyperparameters
-The following hyperparameters were used during training:
-- learning_rate: 5e-05
-- train_batch_size: 4
-- eval_batch_size: 4
-- seed: 42
-- gradient_accumulation_steps: 32
-- total_train_batch_size: 128
-- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
-- lr_scheduler_type: linear
-- lr_scheduler_warmup_ratio: 0.1
-- num_epochs: 5
-### Framework versions
-- Transformers 4.48.3
-- Pytorch 2.2.1+cu121
-- Datasets 2.17.1
-- Tokenizers 0.21.0

+---
+library_name: transformers
+license: apache-2.0
+base_model: urarik/t5-asr-CV16
+tags:
+- generated_from_trainer
+metrics:
+- wer
+model-index:
+- name: t5-asr-CV16
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# t5-asr-CV16
+This model is a fine-tuned version of [urarik/t5-asr-CV16](https://huggingface.co/urarik/t5-asr-CV16) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 1.3355
+- Wer: 0.9169
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-05
+- train_batch_size: 32
+- eval_batch_size: 32
+- seed: 42
+- gradient_accumulation_steps: 16
+- total_train_batch_size: 512
+- optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_ratio: 0.1
+- num_epochs: 10
+### Training results
+| Training Loss | Epoch  | Step | Validation Loss | Wer    |
+|:-------------:|:------:|:----:|:---------------:|:------:|
+| 3.0643        | 0.9967 | 207  | 1.5132          | 0.9353 |
+| 2.9128        | 1.9967 | 414  | 1.4829          | 0.9332 |
+| 2.9411        | 2.9967 | 621  | 1.4659          | 0.9319 |
+| 2.7918        | 3.9967 | 828  | 1.4447          | 0.9332 |
+| 2.8172        | 4.9967 | 1035 | 1.4313          | 0.9283 |
+| 2.7944        | 5.9967 | 1242 | 1.4219          | 0.9273 |
+| 2.8294        | 6.9967 | 1449 | 1.4083          | 0.9273 |
+| 2.7507        | 7.9967 | 1656 | 1.3781          | 0.9273 |
+| 2.6134        | 8.9967 | 1863 | 1.3522          | 0.9213 |
+| 2.6169        | 9.9967 | 2070 | 1.3355          | 0.9169 |
+### Framework versions
+- Transformers 4.48.3
+- Pytorch 2.5.1+cu121
+- Datasets 2.17.1
+- Tokenizers 0.21.0
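
The batch-size and schedule values on the `+` side of the hunk above can be cross-checked with a little arithmetic. The sketch below does that under two assumptions: a single device (the card does not state the device count), and a hand-written `linear_lr` function that only approximates a linear schedule with warmup. `linear_lr` is a hypothetical helper, not the Trainer's own code, and the warmup step count is rounded here, which may differ from how the library rounds.

```python
# Values copied from the "+" side of the model card diff.
train_batch_size = 32
gradient_accumulation_steps = 16
num_devices = 1          # assumption: not stated in the card
total_steps = 2070       # last row of the training results table
warmup_ratio = 0.1
learning_rate = 5e-5
num_epochs = 10

# Samples consumed per optimizer step.
effective_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# Optimizer steps per epoch, from the table (207, 414, ..., 2070).
steps_per_epoch = total_steps // num_epochs

# Number of warmup steps implied by the warmup ratio (rounded; an assumption).
warmup_steps = round(total_steps * warmup_ratio)


def linear_lr(step: int) -> float:
    """Hypothetical helper: linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    remaining = total_steps - step
    return learning_rate * max(0.0, remaining / max(1, total_steps - warmup_steps))


print(effective_batch_size, steps_per_epoch, warmup_steps)
print(linear_lr(0), linear_lr(warmup_steps), linear_lr(total_steps))
```

With these inputs the effective batch size matches the reported `total_train_batch_size` of 512, and the warmup spans roughly the first of the ten epochs.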

generation_config.json CHANGED

@@ -1,7 +1,7 @@
-{
-  "decoder_start_token_id": 0,
-  "eos_token_id": 1,
-  "max_new_tokens": 64,
-  "pad_token_id": 0,
-  "transformers_version": "4.48.3"
-}

+{
+  "decoder_start_token_id": 0,
+  "eos_token_id": 1,
+  "max_new_tokens": 64,
+  "pad_token_id": 0,
+  "transformers_version": "4.48.3"
+}
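
As displayed, the `-` and `+` sides of this hunk carry the same keys and values, so the change is probably limited to something the text view does not show, such as whitespace or line endings (an assumption, not something the diff confirms). A minimal sketch using only the standard library parses both sides and lists any key whose value differs:

```python
import json

# Both sides of the generation_config.json hunk, with the diff markers removed.
old_side = """
{
  "decoder_start_token_id": 0,
  "eos_token_id": 1,
  "max_new_tokens": 64,
  "pad_token_id": 0,
  "transformers_version": "4.48.3"
}
"""
new_side = """
{
  "decoder_start_token_id": 0,
  "eos_token_id": 1,
  "max_new_tokens": 64,
  "pad_token_id": 0,
  "transformers_version": "4.48.3"
}
"""

old_cfg = json.loads(old_side)
new_cfg = json.loads(new_side)

# Collect every key whose value differs between the two sides.
changed = {
    key: (old_cfg.get(key), new_cfg.get(key))
    for key in sorted(set(old_cfg) | set(new_cfg))
    if old_cfg.get(key) != new_cfg.get(key)
}
print(changed)
```

An empty result means the generation settings themselves (start token, EOS, padding, and the 64-token generation cap) are unchanged by this commit.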