privacy-300k-masking
This model is a fine-tuned version of distilbert-base-multilingual-cased on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.3729
- Overall Precision: 0.2655
- Overall Recall: 0.1856
- Overall F1: 0.2185
- Overall Accuracy: 0.8664
- Bod F1: 0.2060
- Building F1: 0.2527
- Cardissuer F1: 0.0
- City F1: 0.2253
- Country F1: 0.2800
- Date F1: 0.2289
- Driverlicense F1: 0.1902
- Email F1: 0.2350
- Geocoord F1: 0.1572
- Givenname1 F1: 0.2029
- Givenname2 F1: 0.1330
- Idcard F1: 0.2208
- Ip F1: 0.1826
- Lastname1 F1: 0.1877
- Lastname2 F1: 0.0937
- Lastname3 F1: 0.0328
- Pass F1: 0.1950
- Passport F1: 0.2256
- Postcode F1: 0.2518
- Secaddress F1: 0.2101
- Sex F1: 0.2636
- Socialnumber F1: 0.1891
- State F1: 0.2639
- Street F1: 0.1915
- Tel F1: 0.2077
- Time F1: 0.2551
- Title F1: 0.2453
- Username F1: 0.2325
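The Overall F1 above is the harmonic mean of the overall precision and recall, which can be verified directly from the reported values:

```python
# Values copied from the evaluation results above.
precision = 0.2655
recall = 0.1856

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.2185, matching the reported Overall F1
```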
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine_with_restarts
- lr_scheduler_warmup_ratio: 0.2
- num_epochs: 1
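A minimal sketch of the learning-rate schedule these settings imply: linear warmup over the first 20% of the 88,839 training steps, then a cosine decay. This assumes the `transformers` default of one cycle for `cosine_with_restarts` (with one cycle it reduces to plain cosine annealing); the function below is an illustration, not the library's implementation.

```python
import math

def lr_at(step, total_steps=88839, base_lr=5e-5, warmup_ratio=0.2, num_cycles=1):
    """Sketch of linear warmup + cosine-with-hard-restarts decay (hyperparameters from this card)."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 up to the peak learning rate.
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    if progress >= 1.0:
        return 0.0
    # Each cycle is a full cosine decay; with num_cycles=1 this is a single cosine anneal to 0.
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * ((num_cycles * progress) % 1.0))))

print(lr_at(0))      # 0.0 at the start of warmup
print(lr_at(17767))  # peak learning rate (5e-05) at the end of warmup
```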
Training results
| Training Loss | Epoch | Step | Validation Loss | Overall Precision | Overall Recall | Overall F1 | Overall Accuracy | Bod F1 | Building F1 | Cardissuer F1 | City F1 | Country F1 | Date F1 | Driverlicense F1 | Email F1 | Geocoord F1 | Givenname1 F1 | Givenname2 F1 | Idcard F1 | Ip F1 | Lastname1 F1 | Lastname2 F1 | Lastname3 F1 | Pass F1 | Passport F1 | Postcode F1 | Secaddress F1 | Sex F1 | Socialnumber F1 | State F1 | Street F1 | Tel F1 | Time F1 | Title F1 | Username F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.3954 | 1.0 | 88839 | 0.3729 | 0.2655 | 0.1856 | 0.2185 | 0.8664 | 0.2060 | 0.2527 | 0.0 | 0.2253 | 0.2800 | 0.2289 | 0.1902 | 0.2350 | 0.1572 | 0.2029 | 0.1330 | 0.2208 | 0.1826 | 0.1877 | 0.0937 | 0.0328 | 0.1950 | 0.2256 | 0.2518 | 0.2101 | 0.2636 | 0.1891 | 0.2639 | 0.1915 | 0.2077 | 0.2551 | 0.2453 | 0.2325 |
Framework versions
- Transformers 4.40.0
- Pytorch 2.2.1+cu121
- Datasets 2.19.0
- Tokenizers 0.19.1