Cross-Encoder for MS MARCO with TinyBERT
This is a fine-tuned version of the cross-encoder/ms-marco-MiniLM-L-4-v2 model checkpoint.
It was fine-tuned on HTML tags and labels generated using Fathom.
How to use this model in transformers
from transformers import pipeline

# Classify serialized form-field markup into autofill field types
classifier = pipeline(
    "text-classification",
    model="Mozilla/tinybert-uncased-autofill"
)
print(
    classifier('Card information input Card number cc-number <SEP> <SEP> input First name <SEP> <SEP>')
)
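The pipeline returns one {'label': ..., 'score': ...} dict per input string, where the label is one of the field types listed in the performance table below. A minimal sketch of inspecting the output (the score in the comment is illustrative, not actual model output):

from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="Mozilla/tinybert-uncased-autofill"
)

# One dict per input string; the label is the predicted field type.
predictions = classifier([
    'Card information input Card number cc-number <SEP> <SEP> input First name <SEP> <SEP>',
])
for prediction in predictions:
    # e.g. {'label': 'cc-number', 'score': 0.97}  (illustrative values)
    print(prediction["label"], round(prediction["score"], 3))

# Passing top_k=None returns scores for every label instead of only the
# highest-scoring one (supported in recent transformers releases).
all_scores = classifier("input First name <SEP> <SEP>", top_k=None)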
Model Training Info
HyperParameters = {
    'learning_rate': 2.3878733582558547e-05,
    'num_train_epochs': 21,
    'weight_decay': 0.0005288040458920454,
    'per_device_train_batch_size': 32
}
More information on how the model was trained can be found here: https://github.com/mozilla/smart_autofill
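For context, here is a minimal, hypothetical sketch of how a fine-tune with these hyperparameters could be wired up with the Hugging Face Trainer. The toy dataset, tokenization, and label indexing are placeholder assumptions; the actual training pipeline lives in the smart_autofill repository linked above:

from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    DataCollatorWithPadding,
    Trainer,
    TrainingArguments,
)

base = "cross-encoder/ms-marco-MiniLM-L-4-v2"
tokenizer = AutoTokenizer.from_pretrained(base)
# 12 field types appear in the classification report below; the base
# cross-encoder ships a single-logit head, so its size is ignored here.
model = AutoModelForSequenceClassification.from_pretrained(
    base,
    num_labels=12,
    ignore_mismatched_sizes=True,
)

# Toy placeholder data; the real Fathom-labeled dataset is prepared in
# the mozilla/smart_autofill repository.
toy = Dataset.from_dict({
    "text": ["input Card number cc-number <SEP> <SEP>"],
    "label": [0],
})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True)

toy = toy.map(tokenize, batched=True)

# Hyperparameters copied from the dict above.
args = TrainingArguments(
    output_dir="tinybert-uncased-autofill",
    learning_rate=2.3878733582558547e-05,
    num_train_epochs=21,
    weight_decay=0.0005288040458920454,
    per_device_train_batch_size=32,
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=toy,
    data_collator=DataCollatorWithPadding(tokenizer=tokenizer),
)
trainer.train()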
Model Performance
Test Performance (macro-averaged):
Precision: 0.913
Recall: 0.872
F1: 0.887
              precision    recall  f1-score   support

      cc-csc      0.943     0.950     0.946       139
      cc-exp      1.000     0.883     0.938        60
cc-exp-month      0.954     0.922     0.938        90
 cc-exp-year      0.904     0.934     0.919        91
     cc-name      0.835     0.989     0.905        92
   cc-number      0.953     0.970     0.961       167
     cc-type      0.920     0.940     0.930       183
       email      0.918     0.927     0.922       205
  given-name      0.727     0.421     0.533        19
   last-name      0.833     0.588     0.690        17
       other      0.994     0.994     0.994      8000
 postal-code      0.980     0.951     0.965       102

    accuracy                          0.985      9165
   macro avg      0.913     0.872     0.887      9165
weighted avg      0.986     0.985     0.985      9165
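The per-label table above follows scikit-learn's classification_report layout. A minimal sketch of producing such a report, assuming y_true and y_pred come from running the classifier over a held-out labeled test set (the tiny lists here are placeholders, not the actual evaluation data):

from sklearn.metrics import classification_report

# Placeholder predictions; in practice these come from running the
# pipeline over the Fathom-labeled test set.
y_true = ["cc-number", "email", "other", "other", "postal-code"]
y_pred = ["cc-number", "email", "other", "cc-type", "postal-code"]

# digits=3 matches the precision shown in the table above.
print(classification_report(y_true, y_pred, digits=3, zero_division=0))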