BERT-Router-large-v2

This model is a fine-tuned version of google-bert/bert-large-uncased on an unknown dataset. After the final training epoch (epoch 20, step 240), it achieves the following results on the evaluation set:

Loss: 0.3815
Accuracy: 0.838
Auc: 0.951

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 1e-06
train_batch_size: 1024
eval_batch_size: 1024
seed: 42
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: linear
num_epochs: 20
mixed_precision_training: Native AMP
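
To make the effect of the linear scheduler concrete, the following minimal sketch computes the learning rate at a given optimizer step. It assumes 12 optimizer steps per epoch (taken from the results table, 240 steps in total) and no warmup, since the card lists none. The names `linear_lr`, `STEPS_PER_EPOCH`, and `WARMUP_STEPS` are illustrative and not part of the original training script.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-06
NUM_EPOCHS = 20

# Assumption: 12 optimizer steps per epoch, as in the results table.
STEPS_PER_EPOCH = 12
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH

# Assumption: no warmup, because the card does not list any.
WARMUP_STEPS = 0


def linear_lr(step: int) -> float:
    """Learning rate at `step`: linear warmup (if any), then linear decay to zero."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    # Print the learning rate at a few epoch boundaries.
    for epoch in (0, 5, 10, 15, 20):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:>2} (step {step:>3}): lr = {linear_lr(step):.2e}")
```

Under these assumptions the learning rate starts at 1e-06, is halved by the end of epoch 10, and reaches zero at step 240.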

Training Loss	Epoch	Step	Validation Loss	Accuracy	Auc
0.3719	1.0	12	0.3895	0.835	0.949
0.3741	2.0	24	0.3886	0.835	0.949
0.3673	3.0	36	0.3879	0.836	0.949
0.3692	4.0	48	0.3873	0.836	0.949
0.3724	5.0	60	0.3866	0.836	0.95
0.3683	6.0	72	0.3859	0.836	0.95
0.3678	7.0	84	0.3853	0.836	0.95
0.3671	8.0	96	0.3847	0.837	0.95
0.3614	9.0	108	0.3842	0.837	0.95
0.3658	10.0	120	0.3838	0.837	0.95
0.3681	11.0	132	0.3834	0.837	0.95
0.3642	12.0	144	0.3831	0.837	0.95
0.3659	13.0	156	0.3827	0.837	0.95
0.3693	14.0	168	0.3823	0.838	0.95
0.3637	15.0	180	0.3820	0.838	0.951
0.3596	16.0	192	0.3819	0.838	0.951
0.3732	17.0	204	0.3817	0.838	0.951
0.3685	18.0	216	0.3816	0.838	0.951
0.3613	19.0	228	0.3815	0.838	0.951
0.3656	20.0	240	0.3815	0.838	0.951
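
As a rough way to read the table, the sketch below summarizes how the evaluation metrics change between the first and last epoch. The rows are transcribed from the table, reading the values near 0.38 as validation loss and those near 0.95 as AUC. The `RESULTS` list and the `summarize` helper are illustrative names, not part of the original training code.

```python
# (epoch, validation_loss, accuracy, auc), transcribed from the results table.
RESULTS = [
    (1, 0.3895, 0.835, 0.949),
    (2, 0.3886, 0.835, 0.949),
    (3, 0.3879, 0.836, 0.949),
    (4, 0.3873, 0.836, 0.949),
    (5, 0.3866, 0.836, 0.95),
    (6, 0.3859, 0.836, 0.95),
    (7, 0.3853, 0.836, 0.95),
    (8, 0.3847, 0.837, 0.95),
    (9, 0.3842, 0.837, 0.95),
    (10, 0.3838, 0.837, 0.95),
    (11, 0.3834, 0.837, 0.95),
    (12, 0.3831, 0.837, 0.95),
    (13, 0.3827, 0.837, 0.95),
    (14, 0.3823, 0.838, 0.95),
    (15, 0.3820, 0.838, 0.951),
    (16, 0.3819, 0.838, 0.951),
    (17, 0.3817, 0.838, 0.951),
    (18, 0.3816, 0.838, 0.951),
    (19, 0.3815, 0.838, 0.951),
    (20, 0.3815, 0.838, 0.951),
]


def summarize(rows):
    """Compare the first and last epoch and check the validation-loss trend."""
    first, last = rows[0], rows[-1]
    losses = [r[1] for r in rows]
    return {
        "val_loss_drop": round(first[1] - last[1], 4),
        "accuracy_gain": round(last[2] - first[2], 3),
        "auc_gain": round(last[3] - first[3], 3),
        # True when validation loss never rises from one epoch to the next.
        "val_loss_non_increasing": all(b <= a for a, b in zip(losses, losses[1:])),
        # Earliest epoch with the lowest validation loss.
        "best_epoch": min(rows, key=lambda r: r[1])[0],
    }


if __name__ == "__main__":
    for key, value in summarize(RESULTS).items():
        print(f"{key}: {value}")
```

Across the 20 epochs, validation loss falls from 0.3895 to 0.3815 while accuracy moves from 0.835 to 0.838, so the reported changes are small at the listed learning rate.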