Uploaded model

Developed by: deshanksuman
License: apache-2.0
Finetuned from model: Qwen/Qwen3-4B

Dataset

The FEWS training data is arranged in Instruction, Input, and Output format, with advanced reasoning for sense identification. Data generation was semi-automated using the Arcee models. A human validated the data for its structure and correctness.
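As a rough illustration of the Instruction, Input, and Output layout described above, the sketch below joins one record into a single training string. The template headers, the `format_example` helper, and the sample record are all hypothetical; the exact prompt template used for this model is not stated here, so check the dataset itself before relying on this shape.

```python
def format_example(instruction: str, input_text: str, output: str = "") -> str:
    """Join one Instruction/Input/Output record into a single prompt string.

    The "### ..." section headers are an assumed, Alpaca-style template,
    not the template confirmed for this model.
    """
    return (
        f"### Instruction:\n{instruction.strip()}\n\n"
        f"### Input:\n{input_text.strip()}\n\n"
        f"### Response:\n{output.strip()}"
    )


# Made-up record, for illustration only (not taken from the dataset).
record = {
    "instruction": "Identify the sense of the marked word and explain your reasoning.",
    "input": "She sat on the <WSD>bank</WSD> of the river.",
    "output": "Reasoning: the river context points to land beside water. Sense: riverside.",
}

prompt = format_example(record["instruction"], record["input"], record["output"])
print(prompt)
```

At inference time the same helper can be called with `output` left empty, so the model is asked to generate the reasoning and the sense itself.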

The data source can be accessed here: deshanksuman/Reasoning_WSD_dataset

Hyperparameters for Training

per_device_train_batch_size=4,
gradient_accumulation_steps=8,
warmup_steps=50,
num_train_epochs=2,
learning_rate=2e-4,
fp16=not torch.cuda.is_bf16_supported(),
bf16=torch.cuda.is_bf16_supported(),
logging_steps=10,
optim="adamw_torch",
weight_decay=0.01,
lr_scheduler_type="linear",
seed=3407
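The settings above imply an effective batch size and a learning-rate curve that are easy to work out by hand. The sketch below does this in plain Python, assuming a single device and the usual linear warmup followed by linear decay to zero. The helper names and the example dataset size of 3,200 records are hypothetical, and the step count is an approximation of what a training loop would report.

```python
import math

# Values copied from the hyperparameter list above.
PER_DEVICE_TRAIN_BATCH_SIZE = 4
GRADIENT_ACCUMULATION_STEPS = 8
WARMUP_STEPS = 50
NUM_TRAIN_EPOCHS = 2
LEARNING_RATE = 2e-4


def effective_batch_size(num_devices: int = 1) -> int:
    """Number of examples that contribute to one optimizer update."""
    return PER_DEVICE_TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * num_devices


def total_optimizer_steps(num_examples: int, num_devices: int = 1) -> int:
    """Approximate number of optimizer updates over all epochs."""
    steps_per_epoch = math.ceil(num_examples / effective_batch_size(num_devices))
    return steps_per_epoch * NUM_TRAIN_EPOCHS


def linear_lr(step: int, total_steps: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, then linear decay."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, total_steps - step)
    return LEARNING_RATE * remaining / max(1, total_steps - WARMUP_STEPS)


# Hypothetical dataset size, chosen only to make the arithmetic easy to follow.
total = total_optimizer_steps(num_examples=3200)
print(effective_batch_size(), total)
for step in (0, 25, 50, total):
    print(step, linear_lr(step, total))
```

With these values each optimizer update sees 4 x 8 = 32 examples per device, and the learning rate reaches its peak of 2e-4 at step 50 before decaying.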

This model was developed by Deshan Sumanathilaka (https://sumanathilaka.github.io).

Acknowledgement

We acknowledge the support of the Supercomputing Wales project, which is part-funded by the European Regional Development Fund (ERDF) via Welsh Government.

Model repository: deshanksuman/finetunedQwen3-4B-Instruct-WSD-Advanced-reasoning
