# privacy-200k-masking
This model is a fine-tuned version of [distilbert-base-multilingual-cased](https://huggingface.co/distilbert/distilbert-base-multilingual-cased) on an unknown dataset. It achieves the following results on the evaluation set (a short usage sketch follows the metric list):
- eval_loss: 0.0949
- eval_overall_precision: 0.9099
- eval_overall_recall: 0.9306
- eval_overall_f1: 0.9201
- eval_overall_accuracy: 0.9692
- eval_ACCOUNTNAME_f1: 0.9863
- eval_ACCOUNTNUMBER_f1: 0.9551
- eval_AGE_f1: 0.9454
- eval_AMOUNT_f1: 0.9481
- eval_BIC_f1: 0.9140
- eval_BITCOINADDRESS_f1: 0.9227
- eval_BUILDINGNUMBER_f1: 0.9056
- eval_CITY_f1: 0.9351
- eval_COMPANYNAME_f1: 0.9621
- eval_COUNTY_f1: 0.9756
- eval_CREDITCARDCVV_f1: 0.9201
- eval_CREDITCARDISSUER_f1: 0.9767
- eval_CREDITCARDNUMBER_f1: 0.8506
- eval_CURRENCY_f1: 0.7277
- eval_CURRENCYCODE_f1: 0.8398
- eval_CURRENCYNAME_f1: 0.1576
- eval_CURRENCYSYMBOL_f1: 0.9216
- eval_DATE_f1: 0.7988
- eval_DOB_f1: 0.6103
- eval_EMAIL_f1: 0.9862
- eval_ETHEREUMADDRESS_f1: 0.9624
- eval_EYECOLOR_f1: 0.9779
- eval_FIRSTNAME_f1: 0.9636
- eval_GENDER_f1: 0.9852
- eval_HEIGHT_f1: 0.9771
- eval_IBAN_f1: 0.9513
- eval_IP_f1: 0.0
- eval_IPV4_f1: 0.8240
- eval_IPV6_f1: 0.7389
- eval_JOBAREA_f1: 0.9713
- eval_JOBTITLE_f1: 0.9819
- eval_JOBTYPE_f1: 0.9743
- eval_LASTNAME_f1: 0.9439
- eval_LITECOINADDRESS_f1: 0.8069
- eval_MAC_f1: 0.9668
- eval_MASKEDNUMBER_f1: 0.8084
- eval_MIDDLENAME_f1: 0.9401
- eval_NEARBYGPSCOORDINATE_f1: 0.9963
- eval_ORDINALDIRECTION_f1: 0.9904
- eval_PASSWORD_f1: 0.9690
- eval_PHONEIMEI_f1: 0.9842
- eval_PHONENUMBER_f1: 0.9690
- eval_PIN_f1: 0.8584
- eval_PREFIX_f1: 0.9594
- eval_SECONDARYADDRESS_f1: 0.9880
- eval_SEX_f1: 0.9952
- eval_SSN_f1: 0.9813
- eval_STATE_f1: 0.9664
- eval_STREET_f1: 0.9607
- eval_TIME_f1: 0.9560
- eval_URL_f1: 0.9866
- eval_USERAGENT_f1: 0.9901
- eval_USERNAME_f1: 0.9743
- eval_VEHICLEVIN_f1: 0.9699
- eval_VEHICLEVRM_f1: 0.9725
- eval_ZIPCODE_f1: 0.9018
- eval_runtime: 3609.2787
- eval_samples_per_second: 17.394
- eval_steps_per_second: 8.697
- epoch: 1.0
- step: 73241
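
The per-entity F1 scores above indicate a token-classification (NER) model for PII detection and masking. Below is a minimal usage sketch with the `transformers` pipeline; the repo id `privacy-200k-masking` is a placeholder for the actual Hub path, and the example simply replaces each detected span with its entity label.

```python
from transformers import pipeline

# Placeholder repo id -- replace with the actual Hub path of this model.
MODEL_ID = "privacy-200k-masking"

# "simple" aggregation merges word pieces back into whole entity spans.
masker = pipeline("token-classification", model=MODEL_ID, aggregation_strategy="simple")

def mask_pii(text: str) -> str:
    """Replace every detected PII span with its entity label, e.g. [EMAIL]."""
    # Replace from the end of the string so earlier character offsets stay valid.
    for ent in sorted(masker(text), key=lambda e: e["start"], reverse=True):
        text = text[: ent["start"]] + f"[{ent['entity_group']}]" + text[ent["end"] :]
    return text

print(mask_pii("Contact Jane Doe at jane.doe@example.com."))
# Expected shape of output: "Contact [FIRSTNAME] [LASTNAME] at [EMAIL]."
```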
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine_with_restarts
- lr_scheduler_warmup_ratio: 0.2
- num_epochs: 2
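
For reproduction, these values map onto `transformers.TrainingArguments` roughly as sketched below. This is an assumption about the original setup, not the actual training script; `output_dir` and the surrounding data/model wiring are hypothetical.

```python
from transformers import TrainingArguments

# Sketch of the hyperparameters listed above; output_dir is hypothetical.
training_args = TrainingArguments(
    output_dir="privacy-200k-masking",
    learning_rate=5e-05,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    # Adam betas and epsilon as listed (also the library defaults).
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine_with_restarts",
    warmup_ratio=0.2,
    num_train_epochs=2,
)
```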
### Framework versions
- Transformers 4.40.0
- Pytorch 2.2.1+cu121
- Datasets 2.19.0
- Tokenizers 0.19.1
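
A quick runtime check that an environment matches these pins (assuming the packages are installed):

```python
# Print installed versions next to the versions the card was built with.
import datasets
import tokenizers
import torch
import transformers

for name, module, expected in [
    ("Transformers", transformers, "4.40.0"),
    ("PyTorch", torch, "2.2.1+cu121"),
    ("Datasets", datasets, "2.19.0"),
    ("Tokenizers", tokenizers, "0.19.1"),
]:
    print(f"{name}: {module.__version__} (card: {expected})")
```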