Uploaded model
- Developed by: deshanksuman
- License: apache-2.0
- Finetuned from model: google/gemma-2-2b-it
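Since this is a Gemma-2 instruction-tuned derivative, it can be loaded with the standard transformers chat workflow. Below is a minimal inference sketch; the repo id is a hypothetical placeholder (the actual model id is not stated above) and should be replaced before use.

```python
# Minimal inference sketch for a Gemma-2 fine-tune via transformers.
# NOTE: "deshanksuman/<model-repo>" is a hypothetical placeholder id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deshanksuman/<model-repo>"  # replace with the actual model id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16 if torch.cuda.is_bf16_supported() else torch.float16,
    device_map="auto",
)

messages = [
    {"role": "user",
     "content": "Disambiguate the sense of 'bank' in: 'She sat on the bank of the river.'"}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```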
Dataset
FEWS training data arranged in the Instruction, Input, and Output format: deshanksuman/Instruct_Finetune_with_Reasoning_WSD
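A minimal sketch of loading the dataset and assembling a single training text follows; the column names instruction, input, and output are assumptions inferred from the format described above, not confirmed field names.

```python
# Sketch: load the dataset and join the three fields into one prompt string.
# Column names "instruction", "input", "output" are assumed, not confirmed.
from datasets import load_dataset

ds = load_dataset("deshanksuman/Instruct_Finetune_with_Reasoning_WSD", split="train")

def to_prompt(example):
    # Combine Instruction / Input / Output into a single training text.
    return {
        "text": (
            f"### Instruction:\n{example['instruction']}\n\n"
            f"### Input:\n{example['input']}\n\n"
            f"### Output:\n{example['output']}"
        )
    }

ds = ds.map(to_prompt)
print(ds[0]["text"][:500])
```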
Hyperparameters for Training
- per_device_train_batch_size=4
- gradient_accumulation_steps=8
- warmup_steps=50
- num_train_epochs=1
- learning_rate=2e-4
- fp16=not torch.cuda.is_bf16_supported()
- bf16=torch.cuda.is_bf16_supported()
- logging_steps=10
- optim="adamw_torch"
- weight_decay=0.01
- lr_scheduler_type="linear"
- seed=3407
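For reference, here is a minimal sketch wiring these values into the Hugging Face transformers TrainingArguments API; output_dir is a hypothetical placeholder.

```python
# Sketch: the hyperparameters above expressed as a TrainingArguments config.
import torch
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="outputs",  # hypothetical placeholder
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,
    warmup_steps=50,
    num_train_epochs=1,
    learning_rate=2e-4,
    # Prefer bf16 on hardware that supports it, otherwise fall back to fp16.
    fp16=not torch.cuda.is_bf16_supported(),
    bf16=torch.cuda.is_bf16_supported(),
    logging_steps=10,
    optim="adamw_torch",
    weight_decay=0.01,
    lr_scheduler_type="linear",
    seed=3407,
)
```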
This model was developed by Deshan Sumanathilaka (https://sumanathilaka.github.io).
Acknowledgement
We acknowledge the support of the Supercomputing Wales project, which is part-funded by the European Regional Development Fund (ERDF) via Welsh Government.