Mariyam123/modernbert-llm-router

Text Classification

Generated from Trainer

Mariyam123 committed on Feb 5

Commit accba54 · verified · 1 Parent(s): d08bea5

End of training

Files changed (2)

README.md +6 -6
tokenizer.json +2 -16

README.md CHANGED

@@ -1,7 +1,7 @@
 ---
 library_name: transformers
 license: apache-2.0
-base_model: answerdotai/ModernBERT-base
 tags:
 - generated_from_trainer
 metrics:
@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 # modernbert-llm-router
-This model is a fine-tuned version of [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on an unknown dataset.
 It achieves the following results on the evaluation set:
 - Loss: nan
 - F1: 0.2648
@@ -39,8 +39,8 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
-- train_batch_size: 10
-- eval_batch_size: 10
 - seed: 42
 - optimizer: Use adamw_torch_fused with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
@@ -50,8 +50,8 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss | F1     |
 |:-------------:|:-----:|:----:|:---------------:|:------:|
-| 0.0           | 1.0   | 1531 | nan             | 0.2648 |
-| 0.0           | 2.0   | 3062 | nan             | 0.2648 |
 ### Framework versions

 ---
 library_name: transformers
 license: apache-2.0
+base_model: answerdotai/ModernBERT-large
 tags:
 - generated_from_trainer
 metrics:
 # modernbert-llm-router
+This model is a fine-tuned version of [answerdotai/ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large) on an unknown dataset.
 It achieves the following results on the evaluation set:
 - Loss: nan
 - F1: 0.2648
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
+- train_batch_size: 4
+- eval_batch_size: 4
 - seed: 42
 - optimizer: Use adamw_torch_fused with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 | Training Loss | Epoch | Step | Validation Loss | F1     |
 |:-------------:|:-----:|:----:|:---------------:|:------:|
+| 0.0           | 1.0   | 3827 | nan             | 0.2648 |
+| 0.0           | 2.0   | 7654 | nan             | 0.2648 |
 ### Framework versions
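
The step counts in the two results tables can be cross-checked against the batch-size change. The sketch below is only an illustration and assumes things the README does not state: a single device, no gradient accumulation, and a final partial batch that is kept, so that steps per epoch equals ceil(N / batch_size). The helper `dataset_size_range` is hypothetical and is not part of the training code.

```python
import math


def dataset_size_range(steps_per_epoch: int, batch_size: int) -> range:
    """Return every dataset size N with ceil(N / batch_size) == steps_per_epoch.

    Assumption: one optimizer step per batch and the last partial batch is kept.
    """
    low = (steps_per_epoch - 1) * batch_size + 1
    high = steps_per_epoch * batch_size
    return range(low, high + 1)


# Values taken from the diff: 1531 steps/epoch at batch size 10,
# 3827 steps/epoch at batch size 4.
old = dataset_size_range(1531, 10)
new = dataset_size_range(3827, 4)

# Training-set sizes that are consistent with both runs.
overlap = range(max(old.start, new.start), min(old.stop, new.stop))
print(list(overlap))
```

Under those assumptions the two runs are consistent with the same training set of roughly 15.3k examples, so the higher step count would follow from the smaller batch size rather than from a change in the data.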

tokenizer.json CHANGED

@@ -1,21 +1,7 @@
 {
   "version": "1.0",
-  "truncation": {
-    "direction": "Right",
-    "max_length": 1024,
-    "strategy": "LongestFirst",
-    "stride": 0
-  },
-  "padding": {
-    "strategy": {
-      "Fixed": 1024
-    },
-    "direction": "Right",
-    "pad_to_multiple_of": null,
-    "pad_id": 50283,
-    "pad_type_id": 0,
-    "pad_token": "[PAD]"
-  },
   "added_tokens": [
     {
       "id": 0,

 {
   "version": "1.0",
+  "truncation": null,
+  "padding": null,
   "added_tokens": [
     {
       "id": 0,
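
The tokenizer.json hunk replaces the serialized `truncation` and `padding` blocks (fixed length 1024, right-sided) with `null`. The sketch below reproduces that change on a plain dictionary using only the standard library. It is an illustration of the resulting file shape, not the method the author used to produce the commit; the function name `clear_truncation_and_padding` is hypothetical, and the `added_tokens` list is left empty because the diff truncates it.

```python
import json


def clear_truncation_and_padding(config: dict) -> dict:
    """Return a copy of a tokenizer.json-style dict with truncation and padding unset."""
    cleaned = dict(config)  # shallow copy; key order is preserved
    cleaned["truncation"] = None
    cleaned["padding"] = None
    return cleaned


# The "before" state, copied from the removed lines of the diff.
before = {
    "version": "1.0",
    "truncation": {
        "direction": "Right",
        "max_length": 1024,
        "strategy": "LongestFirst",
        "stride": 0,
    },
    "padding": {
        "strategy": {"Fixed": 1024},
        "direction": "Right",
        "pad_to_multiple_of": None,
        "pad_id": 50283,
        "pad_type_id": 0,
        "pad_token": "[PAD]",
    },
    "added_tokens": [],  # truncated in the diff, left empty here
}

after = clear_truncation_and_padding(before)
serialized = json.dumps(after, indent=2)
print(serialized)
```

With both fields set to `null`, the saved tokenizer no longer pads or truncates every input to 1024 tokens by default, so callers would need to request padding and truncation explicitly at encoding time.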