Mariyam123 committed accba54 (verified) · 1 parent: d08bea5

End of training

Files changed (2)
  1. README.md +6 -6
  2. tokenizer.json +2 -16
README.md CHANGED
@@ -1,7 +1,7 @@
 ---
 library_name: transformers
 license: apache-2.0
-base_model: answerdotai/ModernBERT-base
+base_model: answerdotai/ModernBERT-large
 tags:
 - generated_from_trainer
 metrics:
@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 # modernbert-llm-router
 
-This model is a fine-tuned version of [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on an unknown dataset.
+This model is a fine-tuned version of [answerdotai/ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large) on an unknown dataset.
 It achieves the following results on the evaluation set:
 - Loss: nan
 - F1: 0.2648
@@ -39,8 +39,8 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
-- train_batch_size: 10
-- eval_batch_size: 10
+- train_batch_size: 4
+- eval_batch_size: 4
 - seed: 42
 - optimizer: Use adamw_torch_fused with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
@@ -50,8 +50,8 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss | F1 |
 |:-------------:|:-----:|:----:|:---------------:|:------:|
-| 0.0 | 1.0 | 1531 | nan | 0.2648 |
-| 0.0 | 2.0 | 3062 | nan | 0.2648 |
+| 0.0 | 1.0 | 3827 | nan | 0.2648 |
+| 0.0 | 2.0 | 7654 | nan | 0.2648 |
 
 
 ### Framework versions
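
The step counts in the README results table move with the batch size: 1531 steps per epoch at `train_batch_size: 10` become 3827 at `train_batch_size: 4`. The sketch below checks that the two figures are mutually consistent. It assumes, and the commit does not state, that both runs used the same training set on a single device, with no gradient accumulation and with the final partial batch kept. The helper names are hypothetical.

```python
import math


def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Steps in one epoch, assuming one device, no gradient
    accumulation, and a final partial batch that is kept."""
    return math.ceil(num_examples / batch_size)


def consistent_dataset_sizes(observations, search_limit=100_000):
    """Return every dataset size that reproduces all
    (batch_size, steps_per_epoch) pairs in `observations`."""
    return [
        n
        for n in range(1, search_limit + 1)
        if all(steps_per_epoch(n, b) == s for b, s in observations)
    ]


# (train_batch_size, step count at epoch 1.0) before and after the commit
observations = [(10, 1531), (4, 3827)]
sizes = consistent_dataset_sizes(observations)
print(sizes)  # -> [15305, 15306, 15307, 15308]
```

Under those assumptions, both rows are explained by a training set of roughly 15.3k examples, so the change in step counts follows from the smaller batch size alone.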
tokenizer.json CHANGED
@@ -1,21 +1,7 @@
 {
   "version": "1.0",
-  "truncation": {
-    "direction": "Right",
-    "max_length": 1024,
-    "strategy": "LongestFirst",
-    "stride": 0
-  },
-  "padding": {
-    "strategy": {
-      "Fixed": 1024
-    },
-    "direction": "Right",
-    "pad_to_multiple_of": null,
-    "pad_id": 50283,
-    "pad_type_id": 0,
-    "pad_token": "[PAD]"
-  },
+  "truncation": null,
+  "padding": null,
   "added_tokens": [
     {
       "id": 0,
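
To show what the `tokenizer.json` change means in practice, here is a simplified emulation of the removed settings (right truncation to 1024 tokens, then fixed right padding to 1024 with `pad_id` 50283) next to the new `null` settings. It is a sketch, not the `tokenizers` library's implementation: `apply_tokenizer_settings` is a hypothetical helper that handles only the `Fixed` strategy and the `Right` direction seen in this diff, and the sample token ids are made up.

```python
def apply_tokenizer_settings(ids: list[int], settings: dict) -> list[int]:
    """Emulate right truncation followed by fixed-length right padding.

    Only covers the fields that appear in this commit's diff.
    """
    out = list(ids)
    truncation = settings.get("truncation")
    if truncation is not None:
        # Direction "Right": drop tokens from the end of the sequence.
        out = out[: truncation["max_length"]]
    padding = settings.get("padding")
    if padding is not None:
        # Strategy {"Fixed": n}: append pad ids until the length is n.
        target = padding["strategy"]["Fixed"]
        out = out + [padding["pad_id"]] * max(0, target - len(out))
    return out


# Settings removed by the commit (values copied from the diff).
old_settings = {
    "truncation": {"direction": "Right", "max_length": 1024,
                   "strategy": "LongestFirst", "stride": 0},
    "padding": {"strategy": {"Fixed": 1024}, "direction": "Right",
                "pad_to_multiple_of": None, "pad_id": 50283,
                "pad_type_id": 0, "pad_token": "[PAD]"},
}
# Settings after the commit.
new_settings = {"truncation": None, "padding": None}

sample_ids = [11, 22, 33]  # hypothetical token ids
old_out = apply_tokenizer_settings(sample_ids, old_settings)
new_out = apply_tokenizer_settings(sample_ids, new_settings)
print(len(old_out), len(new_out))  # -> 1024 3
```

With both fields set to `null`, the serialized tokenizer no longer forces every input to 1024 tokens. Padding and truncation then have to be requested by the caller, for example through the arguments passed when the tokenizer is invoked.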