La1ya committed
Commit 13d770a · verified · Parent: f0bfcac

Model save

Files changed (3):
  1. README.md +19 -8
  2. generation_config.json +1 -1
  3. model.safetensors +1 -1
README.md CHANGED
@@ -15,6 +15,8 @@ should probably proofread and complete it, then remove this comment. -->
 # english-hindi-colloquial-translator
 
 This model is a fine-tuned version of [Helsinki-NLP/opus-mt-en-hi](https://huggingface.co/Helsinki-NLP/opus-mt-en-hi) on the None dataset.
+It achieves the following results on the evaluation set:
+- Loss: 1.6868
 
 ## Model description
 
@@ -33,23 +35,32 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 0.0005
-- train_batch_size: 8
-- eval_batch_size: 8
+- learning_rate: 0.0003
+- train_batch_size: 4
+- eval_batch_size: 4
 - seed: 42
-- optimizer: Use OptimizerNames.ADAMW_BNB with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 8
+- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 10
-- num_epochs: 3
+- lr_scheduler_warmup_steps: 2
+- num_epochs: 10
 - mixed_precision_training: Native AMP
 
 ### Training results
 
+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:-----:|:----:|:---------------:|
+| 17.5249       | 2.0   | 2    | 9.0542          |
+| 17.3367       | 4.0   | 4    | 9.0542          |
+| 9.4521        | 6.0   | 6    | 3.3342          |
+| 5.0478        | 8.0   | 8    | 2.1933          |
+| 3.1079        | 10.0  | 10   | 1.6868          |
 
 
 ### Framework versions
 
-- Transformers 4.48.3
+- Transformers 4.47.1
 - Pytorch 2.6.0+cu124
-- Datasets 3.3.1
+- Datasets 3.3.2
 - Tokenizers 0.21.0
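
The new hyperparameter block maps onto a standard `Seq2SeqTrainingArguments` configuration in Transformers. A minimal sketch of how those values might be set up, assuming the auto-generated card corresponds one-to-one to the trainer arguments (the `output_dir` name is hypothetical; all values are taken from the README diff):

```python
# Sketch of a Seq2SeqTrainingArguments setup matching the updated card.
# The output_dir name is hypothetical; every value is from the diff above.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="english-hindi-colloquial-translator",  # hypothetical name
    learning_rate=3e-4,              # learning_rate: 0.0003
    per_device_train_batch_size=4,   # train_batch_size: 4
    per_device_eval_batch_size=4,    # eval_batch_size: 4
    gradient_accumulation_steps=2,   # gradient_accumulation_steps: 2
    seed=42,
    optim="adamw_torch",             # OptimizerNames.ADAMW_TORCH, default betas/epsilon
    lr_scheduler_type="linear",
    warmup_steps=2,                  # lr_scheduler_warmup_steps: 2
    num_train_epochs=10,
    fp16=True,                       # mixed_precision_training: Native AMP
)
```

The card's total_train_batch_size of 8 is the effective batch size: per_device_train_batch_size=4 times gradient_accumulation_steps=2.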
generation_config.json CHANGED
@@ -12,5 +12,5 @@
   "num_beams": 4,
   "pad_token_id": 61949,
   "renormalize_logits": true,
-  "transformers_version": "4.48.3"
+  "transformers_version": "4.47.1"
 }
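
When the checkpoint is loaded, `generate()` picks these settings (beam search with num_beams=4, renormalized logits, pad_token_id=61949) up from `generation_config.json` automatically. A short usage sketch, assuming the repository id is `La1ya/english-hindi-colloquial-translator` (inferred from the commit author and model name, not stated in the diff):

```python
# Sketch of loading the checkpoint and translating with the committed
# generation config. The repo id is an assumption, not confirmed by the diff.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

repo_id = "La1ya/english-hindi-colloquial-translator"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSeq2SeqLM.from_pretrained(repo_id)

# num_beams=4 and renormalize_logits=True from generation_config.json
# are applied by default during generate().
inputs = tokenizer("How are you doing?", return_tensors="pt")
outputs = model.generate(**inputs)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```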
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:35d6b11291116179fcff0594f5f49a6c5e3071949b54c18cdca56a6d28e42ca7
+oid sha256:2ca68bff28c4e1093f35e86dc85f80da84ca4fd4cd33cb7944ad70323851cc18
 size 303704440
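
The `model.safetensors` entry is a Git LFS pointer rather than the weights themselves: `oid` is the SHA-256 digest of the actual weight file, which LFS stores out of band, so a new digest with an unchanged size is exactly what a retrained checkpoint of the same architecture looks like. A small sketch of verifying a downloaded copy against the new pointer (the local file path is a placeholder):

```python
# Verify a downloaded model.safetensors against the LFS pointer's oid.
# The expected digest is the new oid from the diff above; the path is
# a placeholder for wherever the file was downloaded.
import hashlib

EXPECTED = "2ca68bff28c4e1093f35e86dc85f80da84ca4fd4cd33cb7944ad70323851cc18"

sha256 = hashlib.sha256()
with open("model.safetensors", "rb") as f:            # placeholder path
    for chunk in iter(lambda: f.read(1 << 20), b""):  # hash in 1 MiB chunks
        sha256.update(chunk)

assert sha256.hexdigest() == EXPECTED, "file does not match the LFS pointer"
```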