c-ho committed · verified
Commit 9e033fc · Parent(s): 2fd769b

multilingual_dbert_linsearch_only_abstract

Files changed (3):
  1. README.md +16 -28
  2. model.safetensors +1 -1
  3. training_args.bin +1 -1
README.md CHANGED
@@ -18,11 +18,11 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [distilbert/distilbert-base-multilingual-cased](https://huggingface.co/distilbert/distilbert-base-multilingual-cased) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.1511
-- Accuracy: 0.6464
-- F1 Macro: 0.5645
-- Precision Macro: 0.5692
-- Recall Macro: 0.5638
+- Loss: 1.5452
+- Accuracy: 0.6465
+- F1 Macro: 0.5744
+- Precision Macro: 0.5998
+- Recall Macro: 0.5660
 
 ## Model description
 
@@ -41,37 +41,25 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 7e-05
-- train_batch_size: 8
-- eval_batch_size: 8
+- learning_rate: 2e-05
+- train_batch_size: 4
+- eval_batch_size: 4
 - seed: 42
-- gradient_accumulation_steps: 8
-- total_train_batch_size: 64
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.2
-- num_epochs: 15
+- num_epochs: 5
 - mixed_precision_training: Native AMP
 
 ### Training results
 
-| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 Macro | Precision Macro | Recall Macro |
-|:-------------:|:-------:|:-----:|:---------------:|:--------:|:--------:|:---------------:|:------------:|
-| 2.9428 | 1.0 | 1233 | 1.9022 | 0.4889 | 0.2571 | 0.3154 | 0.2787 |
-| 1.7733 | 2.0 | 2466 | 1.3386 | 0.6098 | 0.4794 | 0.5232 | 0.4761 |
-| 1.358 | 3.0 | 3699 | 1.1918 | 0.6336 | 0.5370 | 0.5488 | 0.5434 |
-| 1.179 | 4.0 | 4932 | 1.1346 | 0.6484 | 0.5628 | 0.5637 | 0.5718 |
-| 0.9919 | 5.0 | 6165 | 1.1190 | 0.6485 | 0.5627 | 0.5654 | 0.5663 |
-| 0.9091 | 6.0 | 7398 | 1.1405 | 0.6488 | 0.5571 | 0.5663 | 0.5577 |
-| 0.8513 | 7.0 | 8631 | 1.1511 | 0.6464 | 0.5645 | 0.5692 | 0.5638 |
-| 0.7866 | 8.0 | 9864 | 1.1770 | 0.6431 | 0.5639 | 0.5626 | 0.5693 |
-| 0.6762 | 9.0 | 11097 | 1.2116 | 0.6432 | 0.5594 | 0.5625 | 0.5593 |
-| 0.6205 | 10.0 | 12330 | 1.2403 | 0.6418 | 0.5593 | 0.5600 | 0.5611 |
-| 0.5872 | 11.0 | 13563 | 1.2767 | 0.6368 | 0.5588 | 0.5592 | 0.5604 |
-| 0.5513 | 12.0 | 14796 | 1.2979 | 0.6352 | 0.5524 | 0.5557 | 0.5510 |
-| 0.5125 | 13.0 | 16029 | 1.3053 | 0.6354 | 0.5543 | 0.5562 | 0.5538 |
-| 0.497 | 14.0 | 17262 | 1.3137 | 0.6355 | 0.5563 | 0.5578 | 0.5563 |
-| 0.4911 | 14.9881 | 18480 | 1.3135 | 0.6342 | 0.5550 | 0.5571 | 0.5543 |
+| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 Macro | Precision Macro | Recall Macro |
+|:-------------:|:-----:|:-----:|:---------------:|:--------:|:--------:|:---------------:|:------------:|
+| 1.3208 | 1.0 | 19722 | 1.2841 | 0.6142 | 0.5160 | 0.5368 | 0.5359 |
+| 1.1135 | 2.0 | 39444 | 1.1921 | 0.6449 | 0.5597 | 0.5673 | 0.5575 |
+| 0.8989 | 3.0 | 59166 | 1.2967 | 0.6495 | 0.5643 | 0.5834 | 0.5573 |
+| 0.7155 | 4.0 | 78888 | 1.5452 | 0.6465 | 0.5744 | 0.5998 | 0.5660 |
+| 0.5373 | 5.0 | 98610 | 1.7780 | 0.6400 | 0.5669 | 0.5895 | 0.5605 |
 
 
 ### Framework versions
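The new run pairs a 0.2 warmup ratio with a cosine schedule over 5 epochs of 19,722 steps each, so warmup spans exactly the first epoch. A minimal stdlib sketch of the resulting per-step learning rate, assuming the conventional linear-warmup-then-cosine-decay formula (as used by transformers' cosine scheduler; the function name is illustrative):

```python
import math

def lr_at_step(step, total_steps=98610, base_lr=2e-05, warmup_ratio=0.2):
    """Linear warmup to base_lr, then cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)  # 19722: one full epoch
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

print(lr_at_step(19722))  # peak LR at end of warmup: 2e-05
print(lr_at_step(98610))  # decayed to 0.0 at the final step
```

Under this assumption the best checkpoint (epoch 4, step 78,888) was trained while the learning rate was already well into its cosine decay.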
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:26b813bacefe316752143dff46269ac436821eb27b66cb1f8c363384a71292a3
+oid sha256:92111363ce8f6cd3047fe9c340b723eb8bf2f1427879f3aa663b4d86454146e4
 size 541400436
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1b137bedfce466dd412852ac5b1386e1a88bea24821c952df80ac37205ec86f1
+oid sha256:3a21a84c512d8f40c17f03f41c850a81f03dc2a0b242a694406c7da1e1f2e7c7
 size 5304
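Both binary files are tracked through Git LFS, so the diffs above change only the three-line pointer files (version, oid, size), not the ~541 MB of weights themselves. A minimal sketch of reading such a pointer, using the new `model.safetensors` oid from this commit (the helper name is illustrative):

```python
def parse_lfs_pointer(text):
    """Split a Git LFS pointer file into its space-separated key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# Pointer contents copied from the model.safetensors diff in this commit.
pointer = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:92111363ce8f6cd3047fe9c340b723eb8bf2f1427879f3aa663b4d86454146e4\n"
    "size 541400436\n"
)
fields = parse_lfs_pointer(pointer)
print(fields["oid"])        # sha256 digest identifying the new weights blob
print(int(fields["size"]))  # 541400436 bytes
```

The unchanged `size` on both sides of each diff is expected: retraining swaps the weight values (hence a new oid) without changing the tensor shapes, so the serialized file length stays identical.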