Ministral-8B-Instruct-2410-PsyCourse-fold7

This model is a fine-tuned version of mistralai/Ministral-8B-Instruct-2410 on an unspecified dataset (the model name suggests fold 7 of a PsyCourse dataset). It is distributed as a PEFT adapter rather than merged weights; a loading sketch is given below. It achieves the following results on the evaluation set:

  • Loss: 0.0477
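
Because this repository contains a PEFT adapter, it is loaded on top of the base model. The snippet below is a minimal, unofficial sketch: the repository id is taken from this card, the prompt is a placeholder, and the dtype/device settings (and the availability of a chat template in the tokenizer) are assumptions.

```python
# Minimal sketch (not an official snippet): load the base model, attach the adapter, and generate.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "mistralai/Ministral-8B-Instruct-2410"
adapter_id = "chchen/Ministral-8B-Instruct-2410-PsyCourse-fold7"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"  # dtype/device are assumptions
)
model = PeftModel.from_pretrained(base, adapter_id)  # attaches the PEFT adapter weights
model.eval()

messages = [{"role": "user", "content": "Give a brief example question."}]  # placeholder prompt
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
with torch.no_grad():
    output = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```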

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 16
  • optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08; no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5.0
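
For reference, the sketch below shows how these hyperparameters would map onto transformers.TrainingArguments. It is illustrative only: the actual training script and any adapter (e.g. LoRA) configuration are not specified on this card, and the output directory and evaluation cadence are assumptions (the results table logs validation loss every 50 steps).

```python
# Hedged sketch of the listed hyperparameters as TrainingArguments (Transformers 4.46.x).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="ministral-8b-psycourse-fold7",  # hypothetical output path
    learning_rate=1e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=16,             # effective train batch size 1 * 16 = 16
    num_train_epochs=5.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
    eval_strategy="steps",                      # assumption: eval every 50 steps, per the results table
    eval_steps=50,
    logging_steps=50,
)
```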

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|---------------|--------|------|-----------------|
| 0.2582        | 0.0770 | 50   | 0.2416          |
| 0.0852        | 0.1539 | 100  | 0.0695          |
| 0.0612        | 0.2309 | 150  | 0.0585          |
| 0.0568        | 0.3078 | 200  | 0.0547          |
| 0.0435        | 0.3848 | 250  | 0.0428          |
| 0.0399        | 0.4617 | 300  | 0.0469          |
| 0.0436        | 0.5387 | 350  | 0.0451          |
| 0.0494        | 0.6156 | 400  | 0.0438          |
| 0.0291        | 0.6926 | 450  | 0.0377          |
| 0.0285        | 0.7695 | 500  | 0.0388          |
| 0.0424        | 0.8465 | 550  | 0.0351          |
| 0.0356        | 0.9234 | 600  | 0.0355          |
| 0.0296        | 1.0004 | 650  | 0.0370          |
| 0.0336        | 1.0773 | 700  | 0.0371          |
| 0.0262        | 1.1543 | 750  | 0.0345          |
| 0.0285        | 1.2312 | 800  | 0.0335          |
| 0.0293        | 1.3082 | 850  | 0.0343          |
| 0.0224        | 1.3851 | 900  | 0.0335          |
| 0.0366        | 1.4621 | 950  | 0.0333          |
| 0.0316        | 1.5391 | 1000 | 0.0365          |
| 0.0296        | 1.6160 | 1050 | 0.0322          |
| 0.0322        | 1.6930 | 1100 | 0.0353          |
| 0.0222        | 1.7699 | 1150 | 0.0327          |
| 0.0219        | 1.8469 | 1200 | 0.0355          |
| 0.0273        | 1.9238 | 1250 | 0.0325          |
| 0.0207        | 2.0008 | 1300 | 0.0310          |
| 0.0173        | 2.0777 | 1350 | 0.0315          |
| 0.022         | 2.1547 | 1400 | 0.0341          |
| 0.0098        | 2.2316 | 1450 | 0.0381          |
| 0.0196        | 2.3086 | 1500 | 0.0343          |
| 0.0162        | 2.3855 | 1550 | 0.0386          |
| 0.0129        | 2.4625 | 1600 | 0.0377          |
| 0.0191        | 2.5394 | 1650 | 0.0336          |
| 0.0206        | 2.6164 | 1700 | 0.0352          |
| 0.0229        | 2.6933 | 1750 | 0.0325          |
| 0.0196        | 2.7703 | 1800 | 0.0324          |
| 0.0204        | 2.8472 | 1850 | 0.0318          |
| 0.0187        | 2.9242 | 1900 | 0.0324          |
| 0.023         | 3.0012 | 1950 | 0.0342          |
| 0.0084        | 3.0781 | 2000 | 0.0376          |
| 0.0104        | 3.1551 | 2050 | 0.0413          |
| 0.008         | 3.2320 | 2100 | 0.0392          |
| 0.0073        | 3.3090 | 2150 | 0.0386          |
| 0.0153        | 3.3859 | 2200 | 0.0368          |
| 0.0068        | 3.4629 | 2250 | 0.0363          |
| 0.0115        | 3.5398 | 2300 | 0.0377          |
| 0.0055        | 3.6168 | 2350 | 0.0394          |
| 0.012         | 3.6937 | 2400 | 0.0376          |
| 0.0072        | 3.7707 | 2450 | 0.0391          |
| 0.007         | 3.8476 | 2500 | 0.0400          |
| 0.0098        | 3.9246 | 2550 | 0.0394          |
| 0.0079        | 4.0015 | 2600 | 0.0399          |
| 0.0014        | 4.0785 | 2650 | 0.0418          |
| 0.0071        | 4.1554 | 2700 | 0.0446          |
| 0.0017        | 4.2324 | 2750 | 0.0446          |
| 0.004         | 4.3093 | 2800 | 0.0466          |
| 0.0034        | 4.3863 | 2850 | 0.0474          |
| 0.0038        | 4.4633 | 2900 | 0.0478          |
| 0.0018        | 4.5402 | 2950 | 0.0475          |
| 0.0038        | 4.6172 | 3000 | 0.0477          |
| 0.0046        | 4.6941 | 3050 | 0.0476          |
| 0.0036        | 4.7711 | 3100 | 0.0477          |
| 0.0033        | 4.8480 | 3150 | 0.0476          |
| 0.003         | 4.9250 | 3200 | 0.0477          |

Framework versions

  • PEFT 0.12.0
  • Transformers 4.46.1
  • PyTorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.20.3