# gemma-3-1b-it-arkey_emails-qlora
This model is a fine-tuned version of [google/gemma-3-1b-it](https://huggingface.co/google/gemma-3-1b-it) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 2.1858
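The repository name indicates a QLoRA adapter trained on top of google/gemma-3-1b-it. A minimal inference sketch with PEFT, assuming the adapter is published under the hypothetical repo id `your-username/gemma-3-1b-it-arkey_emails-qlora` and that the base model's tokenizer is used:

```python
import torch
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

# Hypothetical repo id -- replace with the actual adapter location.
adapter_id = "your-username/gemma-3-1b-it-arkey_emails-qlora"

# AutoPeftModelForCausalLM reads the adapter config, loads the base model
# (google/gemma-3-1b-it), and applies the LoRA weights on top of it.
model = AutoPeftModelForCausalLM.from_pretrained(
    adapter_id, torch_dtype=torch.bfloat16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-3-1b-it")

# The "_emails" suffix suggests an email-drafting task; the prompt is illustrative.
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Draft a short follow-up email."}],
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```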
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 10
- total_train_batch_size: 20
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 10
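A minimal sketch of how these settings map onto `TrainingArguments` plus a 4-bit QLoRA setup. The quantization and LoRA choices (NF4, rank, target modules) and the dataset are assumptions; only the values in the list above come from this card:

```python
import torch
from transformers import (AutoModelForCausalLM, BitsAndBytesConfig,
                          Trainer, TrainingArguments)
from peft import LoraConfig, get_peft_model

# 4-bit quantization for QLoRA (NF4 with bf16 compute is the usual choice; assumed here).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-3-1b-it", quantization_config=bnb_config, device_map="auto"
)

# LoRA settings are illustrative; the card does not state rank or target modules.
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
))

# Hyperparameters taken directly from the list above.
args = TrainingArguments(
    output_dir="gemma-3-1b-it-arkey_emails-qlora",
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=10,  # effective batch size: 2 * 10 = 20
    num_train_epochs=10,
    lr_scheduler_type="linear",
    optim="adamw_torch",
    seed=42,
    eval_strategy="epoch",
)
# The training and evaluation datasets are not documented on this card:
# trainer = Trainer(model=model, args=args, train_dataset=..., eval_dataset=...)
# trainer.train()
```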
### Training results
| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| No log        | 0.9774 | 26   | 3.3697          |
| No log        | 1.9774 | 52   | 2.8943          |
| No log        | 2.9774 | 78   | 2.7022          |
| No log        | 3.9774 | 104  | 2.5675          |
| No log        | 4.9774 | 130  | 2.4533          |
| No log        | 5.9774 | 156  | 2.3564          |
| No log        | 6.9774 | 182  | 2.2832          |
| No log        | 7.9774 | 208  | 2.2290          |
| No log        | 8.9774 | 234  | 2.1967          |
| No log        | 9.9774 | 260  | 2.1858          |
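Assuming the validation loss is the usual mean token cross-entropy in nats, the final value corresponds to a perplexity of exp(2.1858) ≈ 8.9:

```python
import math

# Final validation loss from the table above; perplexity = exp(loss).
print(math.exp(2.1858))  # ~8.90
```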
### Framework versions
- PEFT 0.15.2
- Transformers 4.51.3
- Pytorch 2.1.2
- Datasets 3.5.1
- Tokenizers 0.21.1